5' to 3' exonuclease mutations of thermostable DNA polymerases

ABSTRACT

The present invention relates to thermostable DNA polymerases which have been mutated such that a lesser amount of 5' to 3' exonuclease activity is exhibited from that which is exhibited by the native enzyme.

CROSS-REFERENCE TO RELATED APPLICATIONS

This is a continuation-in-part (CIP) of Ser. Nos. 590,213, now abandoned in favor of continuation application U.S. Ser. No. 08/119,754, filed Sep. 10, 1993, 590,466 now abandoned in favor of continuation application U.S. Ser. No. 08/113,531, filed Aug. 27, 1993, and 590,490 now abandoned, all of which were filed on Sep. 28, 1990, and all of which are CIPs of Ser. No. 525,394, filed May 15, 1990, which issued as U.S. Pat. No. 5,079,352 and which is a CIP of abandoned Ser. No. 143,441, filed Jan. 12, 1988, now abandoned in favor of continuation application U.S. Ser. No. 07/873,897, filed Apr. 24, 1992, and which is a CIP of Ser. No. 063,509, filed Jun. 17, 1987, which issued as U.S. Pat. No. 4,889,818 and which is a CIP of abandoned Ser. No. 899,241, filed Aug. 22, 1986.

This is also a CIP of Ser. No. 746,121 filed Aug. 15, 1991 now abandoned in favor of continuation application U.S. Ser. No. 08/082,182, filed Jun. 24, 1993, and which is a CIP of: 1) PCT/US90/07641, filed Dec. 21, 1990, which is a CIP of Ser. No. 585,471, now abandoned in favor of U.S. Ser. No. 08/080,243, filed Jun. 17, 1993, filed Sep. 20, 1990, which is a CIP of Ser. No. 455,611, which has been allowed, filed Dec. 22, 1989, which is a CIP of Ser. No. 143,441, now abandoned in favor of continuation application U.S. Ser. No. 07/073,897, filed Apr. 24, 1992, filed Jan. 12, 1988 and its ancestors as described above; and 2) Ser. No. 609,157, now abandoned, filed Nov. 2, 1990, which is a CIP of Ser. No. 557,517, now abandoned, filed Jul. 24, 1990.

All of the patent applications referenced in this section are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to thermostable DNA polymerases which have been altered or mutated such that a different level of 5' to 3' exonuclease activity is exhibited from that which is exhibited by the native enzyme. The present invention also relates to means for isolating and producing such altered polymerases. Thermostable DNA polymerases are useful in many recombinant DNA techniques, especially nucleic acid amplification by the polymerase chain reaction (PCR), self-sustained sequence replication (3SR), and high temperature DNA sequencing.

2. Background Art

Extensive research has been conducted on the isolation of DNA polymerases from mesophilic microorganisms such as E. coli. See, for example, Bessman et al., 1957, J. Biol. Chem. 223:171-177 and Buttin and Kornberg, 1966, J. Biol. Chem. 241:5419-5427.

Somewhat less investigation has been made on the isolation and purification of DNA polymerases from thermophiles such as Thermus aquaticus, Thermus thermophilus, Thermotoga maritima, Thermus species sps17, Thermus species Z05 and Thermosipho africanus. The use of thermostable enzymes to amplify existing nucleic acid sequences in amounts that are large compared to the amount initially present was described in U.S. Pat. Nos. 4,683,195 and 4,683,202, which describe the PCR process, both disclosures of which are incorporated herein by reference. Primers, template, nucleoside triphosphates, the appropriate buffer and reaction conditions, and polymerase are used in the PCR process, which involves denaturation of target DNA, hybridization of primers, and synthesis of complementary strands. The extension product of each primer becomes a template for the production of the desired nucleic acid sequence. The two patents disclose that, if the polymerase employed is a thermostable enzyme, then polymerase need not be added after every denaturation step, because heat will not destroy the polymerase activity.

U.S. Pat. No. 4,889,818, European Patent Publication No. 258,017 and PCT Publication No. 89/06691, the disclosures of which are incorporated herein by reference, all describe the isolation and recombinant expression of an approximately 94 kDa thermostable DNA polymerase from Thermus aquaticus and the use of that polymerase in PCR. Although T. aquaticus DNA polymerase is especially preferred for use in PCR and other recombinant DNA techniques, there remains a need for other thermostable polymerases.

SUMMARY OF THE INVENTION

In addressing the need for other thermostable polymerases, the present inventors found that some thermostable DNA polymerases such as that isolated from Thermus aquaticus (Taq) display a 5' to 3' exonuclease or structure-dependent single-stranded endonuclease (SDSSE) activity. As is explained in greater detail below, such 5' to 3' exonuclease activity is undesirable in an enzyme to be used in PCR, because it may limit the amount of product produced and contribute to the plateau phenomenon in the normally exponential accumulation of product. Furthermore, the presence of 5' to 3' nuclease activity in a thermostable DNA polymerase may contribute to an impaired ability to efficiently generate long PCR products greater than or equal to 10 kb, particularly for G+C-rich targets. In DNA sequencing applications and cycle sequencing applications, the presence of 5' to 3' nuclease activity may contribute to reduction in desired band intensities and/or generation of spurious or background bands. Finally, the absence of 5' to 3' nuclease activity may facilitate higher sensitivity allelic discrimination in a combined polymerase ligase chain reaction (PLCR) assay.

However, an enhanced or greater amount of 5' to 3' exonuclease activity in a thermostable DNA polymerase may be desirable in such an enzyme which is used in a homogeneous assay system for the concurrent amplification and detection of a target nucleic acid sequence. Generally, an enhanced 5' to 3' exonuclease activity is defined as an enhanced rate of exonuclease cleavage or an enhanced rate of nick-translation synthesis, or by the displacement of a larger nucleotide fragment before cleavage of the fragment.

Accordingly, the present invention was developed to meet the needs of the prior art by providing thermostable DNA polymerases which exhibit altered 5' to 3' exonuclease activity. Depending on the purpose for which the thermostable DNA polymerase will be used, the 5' to 3' exonuclease activity of the polymerase may be altered such that a range of 5' to 3' exonuclease activity may be expressed. This range of 5' to 3' exonuclease activity extends from an enhanced activity to a complete lack of activity. Although enhanced activity is useful in certain PCR applications, e.g. a homogeneous assay, as little 5' to 3' exonuclease activity as possible is desired in thermostable DNA polymerases utilized in most other PCR applications.

It was also found that both site-directed mutagenesis and deletion mutagenesis may result in the desired altered 5' to 3' exonuclease activity in the thermostable DNA polymerases of the present invention. Some mutations which alter the exonuclease activity have been shown to alter the processivity of the DNA polymerase. In many applications (e.g. amplification of moderate sized targets in the presence of a large amount of high complexity genomic DNA) reduced processivity may simplify the optimization of PCRs and contribute to enhanced specificity at high enzyme concentration. Some mutations which eliminate 5' to 3' exonuclease activity do not reduce and may enhance the processivity of the thermostable DNA polymerase and, accordingly, these mutant enzymes may be preferred in other applications (e.g. generation of long PCR products). Some mutations which eliminate the 5' to 3' exonuclease activity simultaneously enhance, relative to the wild type, the thermoresistance of the mutant thermostable polymerase, and thus, these mutant enzymes find additional utility in the amplification of G+C-rich or otherwise difficult to denature targets.

Particular common regions or domains of thermostable DNA polymerase genomes have been identified as preferred sites for mutagenesis to affect the enzyme's 5' to 3' exonuclease. These domains can be isolated and inserted into a thermostable DNA polymerase having little or no natural 5' to 3' exonuclease activity to enhance its activity. Thus, methods of preparing chimeric thermostable DNA polymerases with altered 5' to 3' exonuclease are also encompassed by the present invention.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides DNA sequences and expression vectors that encode thermostable DNA polymerases which have been mutated to alter the expression of 5' to 3' exonuclease. To facilitate understanding of the invention, a number of terms are defined below.

The terms "cell", "cell line", and "cell culture" can be used interchangeably and all such designations include progeny. Thus, the words "transformants" or "transformed cells" include the primary transformed cell and cultures derived from that cell without regard to the number of transfers. All progeny may not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the same functionality as screened for in the originally transformed cell are included in the definition of transformants.

The term "control sequences" refers to DNA sequences necessary for the expression of an operably linked coding sequence in a particular host organism. The control sequences that are suitable for procaryotes, for example, include a promoter, optionally an operator sequence, a ribosome binding site, and possibly other sequences. Eucaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers.

The term "expression system" refers to DNA sequences containing a desired coding sequence and control sequences in operable linkage, so that hosts transformed with these sequences are capable of producing the encoded proteins. To effect transformation, the expression system may be included on a vector; however, the relevant DNA may also be integrated into the host chromosome.

The term "gene" refers to a DNA sequence that comprises control and coding sequences necessary for the production of a recoverable bioactive polypeptide or precursor. The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the enzymatic activity is retained.

The term "operably linked" refers to the positioning of the coding sequence such that control sequences will function to drive expression of the protein encoded by the coding sequence. Thus, a coding sequence "operably linked" to control sequences refers to a configuration wherein the coding sequences can be expressed under the direction of a control sequence.

The term "mixture" as it relates to mixtures containing thermostable polymerases refers to a collection of materials which includes a desired thermostable polymerase but which can also include other proteins. If the desired thermostable polymerase is derived from recombinant host cells, the other proteins will ordinarily be those associated with the host. Where the host is bacterial, the contaminating proteins will, of course, be bacterial proteins.

The term "non-ionic polymeric detergents" refers to surface-active agents that have no ionic charge and that are characterized, for purposes of this invention, by an ability to stabilize thermostable polymerase enzymes at a pH range of from about 3.5 to about 9.5, preferably from 4 to 8.5.

The term "oligonucleotide" as used herein is defined as a molecule comprised of two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and usually more than ten. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be derived synthetically or by cloning.

The term "primer" as used herein refers to an oligonucleotide which is capable of acting as a point of initiation of synthesis when placed under conditions in which primer extension is initiated. An oligonucleotide "primer" may occur naturally, as in a purified restriction digest, or be produced synthetically. Synthesis of a primer extension product which is complementary to a nucleic acid strand is initiated in the presence of four different nucleoside triphosphates and a thermostable polymerase enzyme in an appropriate buffer at a suitable temperature. A "buffer" includes cofactors (such as divalent metal ions) and salt (to provide the appropriate ionic strength), adjusted to the desired pH.

A primer is single-stranded for maximum efficiency in amplification, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. The primer is usually an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the polymerase enzyme. The exact length of a primer will depend on many factors, such as source of primer and result desired, and the reaction temperature must be adjusted depending on primer length and nucleotide sequence to ensure proper annealing of primer to template. Depending on the complexity of the target sequence, an oligonucleotide primer typically contains 15 to 35 nucleotides. Short primer molecules generally require lower temperatures to form sufficiently stable complexes with template.
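The dependence of annealing temperature on primer length and base composition can be illustrated with a rough calculation. The sketch below uses the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C), which is not part of this specification and is only a coarse estimate intended for short oligonucleotides; the primer sequence and the function name `wallace_tm` are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule of thumb is only a coarse guide for short primers;
    it is not the method of the specification.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical 35-nucleotide primer, used for illustration only.
hypothetical_primer = "AGCTTGACCGTATGGCATCAGGTACCTGAAGCTTA"

for length in (15, 25, 35):
    truncated = hypothetical_primer[:length]
    print(length, "nt:", wallace_tm(truncated), "deg C (rough estimate)")
```

Under this estimate the 15-nucleotide version has a lower value than the 35-nucleotide version, consistent with the statement that short primers generally require lower temperatures.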

A primer is selected to be "substantially" complementary to a strand of specific sequence of the template. A primer must be sufficiently complementary to hybridize with a template strand for primer elongation to occur. A primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence being substantially complementary to the strand. Non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence of the template to hybridize and thereby form a template primer complex for synthesis of the extension product of the primer.

The terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes which cut double-stranded DNA at or near a specific nucleotide sequence.

The term "thermostable polymerase enzyme" refers to an enzyme which is stable to heat and is heat resistant and catalyzes (facilitates) combination of the nucleotides in the proper manner to form primer extension products that are complementary to a template nucleic acid strand. Generally, synthesis of a primer extension product begins at the 3' end of the primer and proceeds in the 5' direction along the template strand, until synthesis terminates.

In order to further facilitate understanding of the invention, specific thermostable DNA polymerase enzymes are referred to throughout the specification to exemplify the broad concepts of the invention, and these references are not intended to limit the scope of the invention. The specific enzymes which are frequently referenced are set forth below with a common abbreviation which will be used in the specification and their respective nucleotide and amino acid Sequence ID numbers.

    ______________________________________
    Thermostable DNA          Common
    Polymerase                Abbr.     SEQ. ID NO:
    ______________________________________
    Thermus aquaticus         Taq       SEQ ID NO: 1 (nuc)
                                        SEQ ID NO: 2 (a.a.)
    Thermotoga maritima       Tma       SEQ ID NO: 3 (nuc)
                                        SEQ ID NO: 4 (a.a.)
    Thermus species sps17     Tsps17    SEQ ID NO: 5 (nuc)
                                        SEQ ID NO: 6 (a.a.)
    Thermus species Z05       TZ05      SEQ ID NO: 7 (nuc)
                                        SEQ ID NO: 8 (a.a.)
    Thermus thermophilus      Tth       SEQ ID NO: 9 (nuc)
                                        SEQ ID NO: 10 (a.a.)
    Thermosipho africanus     Taf       SEQ ID NO: 11 (nuc)
                                        SEQ ID NO: 12 (a.a.)
    ______________________________________

As summarized above, the present invention relates to thermostable DNA polymerases which exhibit altered 5' to 3' exonuclease activity from that of the native polymerase. Thus, the polymerases of the invention exhibit either an enhanced 5' to 3' exonuclease activity or an attenuated 5' to 3' exonuclease activity from that of the native polymerase.

Thermostable DNA Polymerases with Attenuated 5' to 3' Exonuclease Activity

DNA polymerases often possess multiple functions. In addition to the polymerization of nucleotides, E. coli DNA polymerase I (pol I), for example, catalyzes the pyrophosphorolysis of DNA as well as the hydrolysis of phosphodiester bonds. Two such hydrolytic activities have been characterized for pol I; one is a 3' to 5' exonuclease activity and the other a 5' to 3' exonuclease activity. The two exonuclease activities are associated with two different domains of the pol I molecule. However, the 5' to 3' exonuclease activity of pol I differs from that of thermostable DNA polymerases in that the 5' to 3' exonuclease activity of thermostable DNA polymerases has stricter structural requirements for the substrate on which it acts.

An appropriate and sensitive assay for the 5' to 3' exonuclease activity of thermostable DNA polymerases takes advantage of the discovery of the structural requirement of the activity. An important feature of the design of the assay is an upstream oligonucleotide primer which positions the polymerase appropriately for exonuclease cleavage of a labeled downstream oligonucleotide probe. For an assay of polymerization-independent exonuclease activity (i.e., an assay performed in the absence of deoxynucleoside triphosphates) the probe must be positioned such that the region of probe complementary to the template is immediately adjacent to the 3'-end of the primer. Additionally, the probe should contain at least one, but preferably 2-10, or most preferably 3-5 nucleotides at the 5'-end of the probe which are not complementary to the template. The combination of the primer and probe when annealed to the template creates a double stranded structure containing a nick with a 3'-hydroxyl 5' of the nick, and a displaced single strand 3' of the nick. Alternatively, the assay can be performed as a polymerization-dependent reaction, in which case each deoxynucleoside triphosphate should be included at a concentration of between 1 μM and 2 mM, preferably between 10 μM and 200 μM, although limited dNTP addition (and thus limited dNTP inclusion) may be involved as dictated by the template sequence. When the assay is performed in the presence of dNTPs, the necessary structural requirements are an upstream oligonucleotide primer to direct the synthesis of the complementary strand of the template by the polymerase, and a labeled downstream oligonucleotide probe which will be contacted by the polymerase in the process of extending the upstream primer. An example of a polymerization-independent thermostable DNA polymerase 5' to 3' exonuclease assay follows.

The synthetic 3' phosphorylated oligonucleotide probe (phosphorylated to preclude polymerase extension) BW33 (GATCGCTGCGCGTAACCACCACACCCGCCGCGCp) (SEQ ID NO:13) (100 pmol) was ³²P-labeled at the 5' end with gamma-[³²P] ATP (3000 Ci/mmol) and T4 polynucleotide kinase. The reaction mixture was extracted with phenol:chloroform:isoamyl alcohol, followed by ethanol precipitation. The ³²P-labeled oligonucleotide probe was redissolved in 100 μl of TE buffer, and unincorporated ATP was removed by gel filtration chromatography on a Sephadex G-50 spin column. Five pmol of ³²P-labeled BW33 probe was annealed to 5 pmol of single-strand M13mp10w DNA, in the presence of 5 pmol of the synthetic oligonucleotide primer BW37 (GCGCTAGGGCGCTGGCAAGTGTAGCGGTCA) (SEQ ID NO:14) in a 100 μl reaction containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 3 mM MgCl₂. The annealing mixture was heated to 95° C. for 5 minutes, cooled to 70° C. over 10 minutes, incubated at 70° C. for an additional 10 minutes, and then cooled to 25° C. over a 30 minute period in a Perkin-Elmer Cetus DNA Thermal Cycler. Exonuclease reactions containing 10 μl of the annealing mixture were pre-incubated at 70° C. for 1 minute. Thermostable DNA polymerase enzyme (approximately 0.01 to 1 unit of DNA polymerase activity, or 0.0005 to 0.05 pmol of enzyme) was added in a 2.5 μl volume to the pre-incubation reaction, and the reaction mixture was incubated at 70° C. Aliquots (5 μl) were removed after 1 minute and 5 minutes, and stopped by the addition of 1 μl of 60 mM EDTA. The reaction products were analyzed by homochromatography and exonuclease activity was quantified following autoradiography. Chromatography was carried out in a homochromatography mix containing 2% partially hydrolyzed yeast RNA in 7 M urea on Polygram CEL 300 DEAE cellulose thin layer chromatography plates. The presence of 5' to 3' exonuclease activity results in the generation of small ³²P-labeled oligomers, which migrate up the TLC plate, and are easily differentiated on the autoradiogram from undegraded probe, which remains at the origin.
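The working concentrations implied by the volumes in the assay above can be checked with simple dilution arithmetic. The following is a minimal sketch, assuming that volumes are additive, that the 2.5 μl enzyme addition contributes no MgCl₂, and that residual EDTA carried over from the TE buffer is negligible; the helper name `scale` is hypothetical.

```python
def scale(concentration: float, part_ul: float, total_ul: float) -> float:
    """Concentration after a component occupying part_ul is brought to total_ul."""
    return concentration * part_ul / total_ul


# Annealing mixture: 5 pmol of probe in 100 ul. pmol/ul equals uM, so x1000 gives nM.
probe_anneal_nM = 5 / 100 * 1000

# Exonuclease reaction: 10 ul of annealing mixture plus 2.5 ul of enzyme.
reaction_ul = 10 + 2.5
probe_reaction_nM = scale(probe_anneal_nM, 10, reaction_ul)
mg_reaction_mM = scale(3.0, 10, reaction_ul)  # assumes no MgCl2 in the enzyme addition

# Stopped aliquot: 5 ul of reaction plus 1 ul of 60 mM EDTA.
stopped_ul = 5 + 1
edta_stop_mM = scale(60.0, 1, stopped_ul)
mg_stop_mM = scale(mg_reaction_mM, 5, stopped_ul)

print(f"probe in reaction: {probe_reaction_nM:.1f} nM")
print(f"MgCl2 in reaction: {mg_reaction_mM:.2f} mM")
print(f"EDTA after stop:   {edta_stop_mM:.1f} mM")
print(f"MgCl2 after stop:  {mg_stop_mM:.2f} mM")
```

Under these assumptions the EDTA in the stopped aliquot is in several-fold molar excess over the magnesium, which is consistent with its use to halt the reaction.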

The 5' to 3' exonuclease activity of the thermostable DNA polymerases excises 5' terminal regions of double-stranded DNA releasing 5'-mono- and oligonucleotides in a sequential manner. The preferred substrate for the exonuclease is displaced single-stranded DNA, with hydrolysis of the phosphodiester bond occurring between the displaced single-stranded DNA and the double-helical DNA. The preferred exonuclease cleavage site is a phosphodiester bond in the double helical region. Thus, the exonuclease activity can be better described as a structure-dependent single-stranded endonuclease (SDSSE).

Many thermostable polymerases exhibit this 5' to 3' exonuclease activity, including the DNA polymerases of Taq, Tma, Tsps17, TZ05, Tth and Taf. When thermostable polymerases which have 5' to 3' exonuclease activity are utilized in the PCR process, a variety of undesirable results have been observed including a limitation of the amount of product produced, an impaired ability to generate long PCR products or amplify regions containing significant secondary structure, the production of shadow bands or the attenuation in signal strength of desired termination bands during DNA sequencing, the degradation of the 5'-end of oligonucleotide primers in the context of double-stranded primer-template complex, nick-translation synthesis during oligonucleotide-directed mutagenesis and the degradation of the RNA component of RNA:DNA hybrids.

The limitation of the amount of PCR product produced is attributable to a plateau phenomenon in the otherwise exponential accumulation of product. Such a plateau phenomenon occurs in part because 5' to 3' exonuclease activity causes the hydrolysis or cleavage of phosphodiester bonds when a polymerase with 5' to 3' exonuclease activity encounters a forked structure on a PCR substrate.

Such forked structures commonly exist in certain G- and C-rich DNA templates. The cleavage of these phosphodiester bonds under these circumstances is undesirable as it precludes the amplification of certain G- and C-rich targets by the PCR process. Furthermore, the phosphodiester bond cleavage also contributes to the plateau phenomenon in the generation of the later cycles of PCR when product strand concentration and renaturation kinetics result in forked structure substrates.

In the context of DNA sequencing, the 5' to 3' exonuclease activity of DNA polymerases is again a hindrance with forked structure templates because the phosphodiester bond cleavage during the DNA extension reactions results in "false stops". These "false stops" in turn contribute to shadow bands, and in extreme circumstances may result in the absence of accurate and interpretable sequence data.

When utilized in a PCR process with double-stranded primer-template complex, the 5' to 3' exonuclease activity of a DNA polymerase may result in the degradation of the 5'-end of the oligonucleotide primers. This activity is undesirable not only in PCR, but also in second-strand cDNA synthesis and sequencing processes.

During optimally efficient oligonucleotide-directed mutagenesis processes, the DNA polymerase which is utilized must not have strand-displacement synthesis and/or nick-translation capability. Thus, the presence of 5' to 3' exonuclease activity in a polymerase used for oligonucleotide-directed mutagenesis is also undesirable.

Finally, the 5' to 3' exonuclease activity of polymerases generally also contains an inherent RNase H activity. When the polymerase is also to be used as a reverse transcriptase, as in a PCR process including an RNA:DNA hybrid, such an inherent RNase H activity may be disadvantageous.

Thus, one aspect of this invention involves the generation of thermostable DNA polymerase mutants displaying greatly reduced, attenuated or completely eliminated 5' to 3' exonuclease activity. Such mutant thermostable DNA polymerases will be more suitable and desirable for use in processes such as PCR, second-strand cDNA synthesis, sequencing and oligonucleotide-directed mutagenesis.

The production of thermostable DNA polymerase mutants with attenuated or eliminated 5' to 3' exonuclease activity may be accomplished by processes such as site-directed mutagenesis and deletion mutagenesis.

For example, a site-directed mutation of G to A in the second position of the codon for Gly at residue 46 in the Taq DNA polymerase amino acid sequence (i.e. mutation of G(137) to A in the DNA sequence) has been found to result in an approximately 1000-fold reduction of 5' to 3' exonuclease activity with no apparent change in polymerase activity, processivity or extension rate. This site-directed mutation of the Taq DNA polymerase nucleotide sequence results in an amino acid change of Gly (46) to Asp.
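The relationship between residue 46 and nucleotide 137 can be verified arithmetically, assuming that nucleotide 1 is the first base of the initiation codon and that the initiating methionine is residue 1. The sketch below also applies the standard genetic code to each of the four glycine codons; the specific Gly 46 codon of Taq is not stated in this section, so all four are shown, and the function names are hypothetical.

```python
# Subset of the standard genetic code needed for this illustration.
CODON_TO_AA = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}


def nucleotide_number(residue: int, codon_position: int) -> int:
    """1-based nucleotide number, assuming nucleotide 1 starts codon 1."""
    if codon_position not in (1, 2, 3):
        raise ValueError("codon_position must be 1, 2 or 3")
    return 3 * (residue - 1) + codon_position


def mutate_second_base(codon: str, new_base: str = "A") -> str:
    """Return the codon with its second base replaced."""
    return codon[0] + new_base + codon[2]


print("Gly 46, second codon position -> nucleotide", nucleotide_number(46, 2))

outcomes = {}
for codon in ("GGT", "GGC", "GGA", "GGG"):
    mutant = mutate_second_base(codon)
    outcomes[codon] = CODON_TO_AA[mutant]
    print(codon, "->", mutant, "encodes", outcomes[codon])
```

Under the standard code, a G to A change at the second position yields Asp only when the glycine codon is GGT or GGC, so a Gly to Asp change by this single substitution implies one of those two codons.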

Glycine 46 of Taq DNA polymerase is conserved in Thermus species sps17 DNA polymerase, but is located at residue 43, and the same Gly to Asp mutation has a similar effect on the 5' to 3' exonuclease activity of Tsps17 DNA polymerase. Such a mutation of the conserved Gly of Tth (Gly 46), TZ05 (Gly 46), Tma (Gly 37) and Taf (Gly 37) DNA polymerases to Asp also has a similar attenuating effect on the 5' to 3' exonuclease activities of those polymerases.

Tsps17 Gly 43, Tth Gly 46, TZ05 Gly 46, Tma Gly 37 and Taf Gly 37 are also found in a conserved A(V/T)YG (SEQ ID NO:15) sequence domain, and changing the glycine to aspartic acid within this conserved sequence domain of any polymerase is also expected to attenuate 5' to 3' exonuclease activity. Specifically, Tsps17 Gly 43, Tth Gly 46, TZ05 Gly 46, and Taf Gly 37 share the AVYG sequence domain, and Tma Gly 37 is found in the ATYG domain. Mutations of glycine to aspartic acid in other thermostable DNA polymerases containing the conserved A(V/T)YG (SEQ ID NO:15) domain can be accomplished utilizing the same principles and techniques used for the site-directed mutagenesis of Taq polymerase. Exemplary of such site-directed mutagenesis techniques are Example 5 of U.S. Ser. No. 523,394, filed May 15, 1990, which issued as U.S. Pat. No. 5,079,352, Example 4 of PCT/US91/07076, which published on Apr. 16, 1992, filed Sep. 27, 1991, Examples 4 and 5 of U.S. Ser. No. 455,967, filed Dec. 22, 1989, which was filed in the PCT as PCT/US90/07639, and which published on Jul. 11, 1991, and Examples 5 and 8 of PCT Application No. 91/05753, filed Aug. 13, 1991, which published on Mar. 5, 1992, each of which is incorporated herein by reference.
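Locating the conserved A(V/T)YG domain in a candidate polymerase sequence, and modeling the Gly to Asp change, can be sketched as a simple pattern search. The example below is an illustration only: the protein fragments are hypothetical placeholders rather than any of the SEQ ID NOs of this specification, and the function names are assumptions.

```python
import re

# A(V/T)YG written as a regular expression over one-letter amino acid codes.
MOTIF = re.compile(r"A[VT]YG")


def conserved_gly_positions(protein: str) -> list[int]:
    """Return 1-based positions of the Gly that ends each A(V/T)YG match."""
    # m.start() is the 0-based index of the Ala; the Gly is 3 residues later.
    return [m.start() + 4 for m in MOTIF.finditer(protein.upper())]


def gly_to_asp(protein: str, position: int) -> str:
    """Return the sequence with the Gly at a 1-based position replaced by Asp."""
    if protein[position - 1].upper() != "G":
        raise ValueError("residue at the given position is not Gly")
    return protein[: position - 1] + "D" + protein[position:]


# Hypothetical fragments for illustration only.
fragment_avyg = "MKLAVYGHH"
fragment_atyg = "MSATYGLLK"

for fragment in (fragment_avyg, fragment_atyg):
    for pos in conserved_gly_positions(fragment):
        print(fragment, "Gly", pos, "->", gly_to_asp(fragment, pos))
```

In a real sequence the motif may occur more than once by chance, so a match would still need to be confirmed by alignment against the known polymerases before mutagenesis is planned.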

Such site-directed mutagenesis is generally accomplished by site-specific primer-directed mutagenesis. This technique is now standard in the art, and is conducted using a synthetic oligonucleotide primer complementary to a single-stranded phage DNA to be mutagenized except for limited mismatching, representing the desired mutation. Briefly, the synthetic oligonucleotide is used as a primer to direct synthesis of a strand complementary to the phasmid or phage, and the resulting double-stranded DNA is transformed into a phage-supporting host bacterium. Cultures of the transformed bacteria are plated in top agar, permitting plaque formation from single cells that harbor the phage, or plated on drug selective media for phasmid vectors.

Theoretically, 50% of the new plaques will contain the phage having, as a single strand, the mutated form; 50% will have the original sequence. The plaques are transferred to nitrocellulose filters and the "lifts" hybridized with kinased synthetic primer at a temperature that permits hybridization of an exact match, but at which the mismatches with the original strand are sufficient to prevent hybridization. Plaques that hybridize with the probe are then picked and cultured, and the DNA is recovered.
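The theoretical 50% figure gives a quick estimate of how many plaques would need to be screened to be reasonably sure of recovering a mutant. The following is a minimal sketch, assuming plaques are independent and that the mutant fraction is a single fixed value; yields in practice may differ from the theoretical value, and the function names are hypothetical.

```python
def chance_of_mutant(n_plaques: int, mutant_fraction: float = 0.5) -> float:
    """Probability that at least one of n independent plaques carries the mutation."""
    if not 0.0 < mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be in (0, 1]")
    return 1.0 - (1.0 - mutant_fraction) ** n_plaques


def plaques_needed(target: float, mutant_fraction: float = 0.5) -> int:
    """Smallest number of plaques giving at least the target probability."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while chance_of_mutant(n, mutant_fraction) < target:
        n += 1
    return n


print("plaques for 99% at the theoretical 50%:", plaques_needed(0.99))
print("plaques for 99% at an assumed 10% yield:", plaques_needed(0.99, 0.10))
```

The second call uses an arbitrary lower yield purely to show how sensitive the required screening effort is to the actual mutant fraction.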

In the constructions set forth below, correct ligations for plasmid construction are confirmed by first transforming E. coli strains DG98, DG101, DG116, or other suitable hosts, with the ligation mixture. Successful transformants are selected by ampicillin, tetracycline or other antibiotic resistance or using other markers, depending on the mode of plasmid construction, as is understood in the art. Plasmids from the transformants are then prepared according to the method of Clewell, D. B., et al., Proc. Natl. Acad. Sci. (USA) (1969) 62:1159, optionally following chloramphenicol amplification (Clewell, D. B., J. Bacteriol. (1972) 110:667). The isolated DNA is analyzed by restriction and/or sequenced by the dideoxy method of Sanger, F., et al., Proc. Natl. Acad. Sci. (USA) (1977) 74:5463 as further described by Messing, et al., Nucleic Acids Res. (1981) 9:309, or by the method of Maxam, et al., Methods in Enzymology (1980) 65:499.

For cloning and sequencing, and for expression of constructions under control of most lac or P_(L) promoters, E. coli strains DG98, DG101, and DG116 were used as the host. For expression under control of the P_(L) N_(RBS) promoter, E. coli strain K12 MC1000 lambda lysogen, N₇ N₅₃ cI857 SusP₈₀, ATCC 39531 may be used. Exemplary hosts used herein for expression of the thermostable DNA polymerases with altered 5' to 3' exonuclease activity are E. coli DG116, which was deposited with ATCC (ATCC 53606) on Apr. 7, 1987 and E. coli KB2, which was deposited with ATCC (ATCC 53075) on Mar. 29, 1985.

For M13 phage recombinants, E. coli strains susceptible to phage infection, such as E. coli K12 strain DG98, are employed. The DG98 strain was deposited with ATCC on Jul. 13, 1984 and has accession number 39768.

Mammalian expression can be accomplished in COS-7, COS-A2, CV-1, and murine cells, and insect cell-based expression in Spodoptera frugiperda.

The thermostable DNA polymerases of the present invention are generally purified from E. coli strain DG116 containing the features of plasmid pLSG33. The primary features are a temperature regulated promoter (λ P_(L) promoter), a temperature regulated plasmid vector, a positive retro-regulatory element (PRE) (see U.S. Pat. No. 4,666,848, issued May 19, 1987), and a modified form of a thermostable DNA polymerase gene. As described at page 46 of the specification of U.S. patent application Ser. No. 455,967, which was filed in the PCT as PCT/US90/07639, and which published on Jul. 11, 1991, pLSG33 was prepared by ligating the NdeI-BamHI restriction fragment of pLSG24 into expression vector pDG178. The resulting plasmids are ampicillin resistant and capable of expressing 5' to 3' exonuclease deficient forms of the thermostable DNA polymerases of the present invention.

The seed flask for a 10 liter fermentation contains tryptone (20 g/l), yeast extract (10 g/l), NaCl (10 g/l) and 0.005% ampicillin. The seed flask is inoculated from colonies from an agar plate, or a frozen glycerol culture stock can be used. The seed is grown to between 0.5 and 1.0 O.D. (A₆₈₀). The volume of seed culture inoculated into the fermentation is calculated such that the final concentration of bacteria will be 1 mg dry weight/liter. The 10 liter growth medium contains 25 mM KH₂PO₄, 10 mM (NH₄)₂SO₄, 4 mM sodium citrate, 0.4 mM FeCl₂, 0.04 mM ZnCl₂, 0.03 mM CoCl₂, 0.03 mM CuCl₂, and 0.03 mM H₃BO₃. The following sterile components are added: 4 mM MgSO₄, 20 g/l glucose, 20 mg/l thiamine-HCl and 50 mg/l ampicillin. The pH is adjusted to 6.8 with NaOH and controlled during the fermentation by addition of NH₄OH. Glucose is continually added during the fermentation by coupling to NH₄OH addition. Foaming is controlled by the addition of polypropylene glycol as necessary, as an anti-foaming agent. Dissolved oxygen concentration is maintained at 40%.

The fermentation is inoculated as described above and the culture is grown at 30° C. until an optical density of 21 (A₆₈₀) is reached. The temperature is then raised to 37° C. to induce synthesis of the desired polymerase. Growth continues for eight hours after induction, and the cells are then harvested by concentration using cross flow filtration followed by centrifugation. The resulting cell paste, about 500 grams, is frozen at -70° C. Unless otherwise indicated, all purification steps are conducted at 4° C.

A portion of the frozen (-70° C.) E. coli K12 strain DG116 harboring plasmid pLSG33 or other suitable host as described above is warmed overnight to -20° C. To the cell pellet the following reagents are added: 1 volume of 2× TE (100 mM Tris-HCl, pH 7.5, 20 mM EDTA), 1 mg/ml leupeptin and 144 mM PMSF (in dimethyl formamide). The final concentration of leupeptin is 1 μg/ml and that of PMSF is 2.4 mM. Preferably, dithiothreitol (DTT) is included in TE to provide a final concentration of 1 mM DTT. The mixture is homogenized at low speed in a blender. All glassware is baked prior to use, and solutions used in the purification are autoclaved, if possible, prior to use. The cells are lysed by passage twice through a Microfluidizer at 10,000 psi.
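
The stock and final inhibitor concentrations given above imply fixed dilution factors (1000-fold for leupeptin, 60-fold for PMSF). The following is a minimal sketch of that arithmetic using C1·V1 = C2·V2; the function name and the 1000 ml working volume are illustrative assumptions and are not taken from the specification.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Return the stock volume needed, from C1 * V1 = C2 * V2.

    Both concentrations must be in the same unit; the result is in the
    unit of final_volume.
    """
    if stock_conc <= 0 or not 0 <= final_conc <= stock_conc:
        raise ValueError("need stock_conc > 0 and 0 <= final_conc <= stock_conc")
    return final_conc * final_volume / stock_conc


# Hypothetical working volume; the text does not state one at this step.
FINAL_VOLUME_ML = 1000.0

# Leupeptin: 1 mg/ml stock (1000 ug/ml) to 1 ug/ml final.
leupeptin_ml = stock_volume(1000.0, 1.0, FINAL_VOLUME_ML)

# PMSF: 144 mM stock to 2.4 mM final.
pmsf_ml = stock_volume(144.0, 2.4, FINAL_VOLUME_ML)

print(f"leupeptin stock: {leupeptin_ml:.2f} ml, PMSF stock: {pmsf_ml:.2f} ml")
```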

The lysate is diluted with 1× TE containing 1 mM DTT to a final volume of 5.5× cell wet weight. Leupeptin is added to 1 μg/ml and PMSF is added to 2.4 mM. The final volume (Fraction I) is approximately 1540 ml.

Ammonium sulfate is gradually added to 0.2M (26.4 g/l) and the lysate stirred. Upon addition of ammonium sulfate, a precipitate forms which is removed prior to the polyethylenimine (PEI) precipitation step, described below. The ammonium sulfate precipitate is removed by centrifugation of the suspension at 15,000-20,000 ×g in a JA-14 rotor for 20 minutes. The supernatant is decanted and retained. The ammonium sulfate supernatant is then stirred on a heating plate until the supernatant reaches 75° C. and then is placed in a 77° C. bath and held there for 15 minutes with occasional stirring. The supernatant is then cooled in an ice bath to 20° C. and a 10 ml aliquot is removed for PEI titration.
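
The figure of 26.4 g/l for 0.2M ammonium sulfate follows from the molar mass of (NH₄)₂SO₄, about 132.14 g/mol. The sketch below reproduces that figure and scales it to the approximately 1540 ml of Fraction I. It assumes that the volume change on dissolving the solid is negligible; the function name is illustrative.

```python
AMMONIUM_SULFATE_G_PER_MOL = 132.14  # molar mass of (NH4)2SO4


def grams_for_molarity(molarity, volume_l, g_per_mol=AMMONIUM_SULFATE_G_PER_MOL):
    """Mass of solid needed to bring volume_l to the given molarity.

    Assumes the solid does not change the solution volume appreciably.
    """
    return molarity * volume_l * g_per_mol


per_liter_g = grams_for_molarity(0.2, 1.0)     # about 26.4 g for one liter
fraction_i_g = grams_for_molarity(0.2, 1.540)  # about 40.7 g for 1540 ml

print(f"{per_liter_g:.1f} g/l; {fraction_i_g:.1f} g for Fraction I")
```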

PEI titration and agarose gel electrophoresis are used to determine that 0.3% PEI (commercially available from BDH as PolyminP) precipitates ~90% of the macromolecular DNA and RNA, i.e., no DNA band is visible on an ethidium bromide stained agarose gel after treatment with PEI. PEI is added slowly with stirring to 0.3% from a 10% stock solution. The PEI treated supernatant is centrifuged at 10,000 RPM (17,000 ×g) for 20 minutes in a JA-14 rotor. The supernatant is decanted and retained. The volume (Fraction II) is approximately 1340 ml.
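
Because the 10% PEI stock itself adds volume, the amount needed to reach 0.3% is slightly more than a simple 0.3/10 ratio would suggest. The following sketch solves for that volume. The starting supernatant volume of 1300 ml is a hypothetical value chosen only for illustration, and the function name is an assumption.

```python
def additive_volume(start_volume, stock_pct, target_pct):
    """Volume of stock to add so the mixture reaches target_pct.

    Solves stock_pct * v / (start_volume + v) = target_pct for v.
    """
    if not 0 <= target_pct < stock_pct:
        raise ValueError("target_pct must be below stock_pct and non-negative")
    return start_volume * target_pct / (stock_pct - target_pct)


supernatant_ml = 1300.0  # hypothetical volume before PEI addition
pei_ml = additive_volume(supernatant_ml, stock_pct=10.0, target_pct=0.3)

# Check: resulting PEI percentage in the combined volume.
final_pct = 10.0 * pei_ml / (supernatant_ml + pei_ml)

print(f"add {pei_ml:.1f} ml of 10% PEI; final {final_pct:.2f}%")
```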

Fraction II is loaded onto a 2.6×13.3 cm (71 ml) phenyl sepharose CL-4B (Pharmacia-LKB) column following equilibration with 6 to 10 column volumes of TE containing 0.2M ammonium sulfate. Loading is at a linear flow rate of 10 cm/hr, which corresponds to 0.9 ml/min. The column is washed with 3 column volumes of the equilibration buffer and then with 2 column volumes of TE to remove contaminating non-DNA polymerase proteins. The recombinant thermostable DNA polymerase is eluted with 4 column volumes of 2.5M urea in TE containing 20% ethylene glycol. The DNA polymerase containing fractions are identified by optical absorption (A₂₈₀), DNA polymerase activity assay and SDS-PAGE according to standard procedures. Peak fractions are pooled and filtered through a 0.2 micron sterile vacuum filtration apparatus. The volume (Fraction III) is approximately 195 ml. The resin is equilibrated and recycled according to the manufacturer's recommendations.
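
The bed volume and the volumetric flow rate quoted for the phenyl sepharose column both follow from its dimensions. A minimal sketch of that geometry is given below; the function names are illustrative, and the column is treated as an ideal cylinder.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Bed volume of a cylindrical column (1 cm^3 = 1 ml)."""
    return math.pi * (diameter_cm / 2) ** 2 * height_cm


def flow_ml_per_min(diameter_cm, linear_cm_per_hr):
    """Convert a linear flow rate (cm/hr) to a volumetric rate (ml/min)."""
    cross_section_cm2 = math.pi * (diameter_cm / 2) ** 2
    return cross_section_cm2 * linear_cm_per_hr / 60.0


phenyl_bed_ml = bed_volume_ml(2.6, 13.3)   # about 71 ml
phenyl_flow = flow_ml_per_min(2.6, 10.0)   # about 0.9 ml/min

print(f"bed volume {phenyl_bed_ml:.0f} ml, flow {phenyl_flow:.1f} ml/min")
```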

A 2.6×17.5 cm (93 ml) heparin sepharose CL-6B column (Pharmacia-LKB) is equilibrated with 6-10 column volumes of 0.05M KCl, 50 mM Tris-HCl, pH 7.5, 0.1 mM EDTA and 0.2% Tween 20, at 1 column volume/hour. Preferably, the buffer contains 1 mM DTT. The column is washed with 3 column volumes of the equilibration buffer. The desired thermostable DNA polymerase of the invention is eluted with a 10 column volume linear gradient of 50-750 mM KCl in the same buffer. Fractions (one-tenth column volume) are collected in sterile tubes and the fractions containing the desired thermostable DNA polymerase are pooled (Fraction IV, volume 177 ml).
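
For a linear gradient, the nominal KCl concentration entering the column depends only on how many column volumes of gradient have been delivered. The sketch below tabulates this for the 50-750 mM, 10 column volume gradient and the one-tenth column volume fractions described above. It ignores dead volume and mixing, so the concentration actually present in a collected fraction will lag these nominal values; the function name is illustrative.

```python
def kcl_mM(cv_elapsed, start_mM=50.0, end_mM=750.0, gradient_cv=10.0):
    """Nominal KCl concentration after cv_elapsed column volumes of gradient."""
    if not 0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    return start_mM + (end_mM - start_mM) * cv_elapsed / gradient_cv


COLUMN_VOLUME_ML = 93.0                # bed volume stated in the text
fraction_ml = COLUMN_VOLUME_ML / 10    # one-tenth column volume per fraction
n_fractions = round(10.0 / 0.1)        # fractions spanning the whole gradient
midpoint_mM = kcl_mM(5.0)              # halfway through the gradient

print(f"{n_fractions} fractions of {fraction_ml:.1f} ml; midpoint {midpoint_mM:.0f} mM KCl")
```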

Fraction IV is concentrated to 10 ml on an Amicon YM30 membrane. For buffer exchange, diafiltration is done 5 times with 2.5× storage buffer (50 mM Tris-HCl, pH 7.5, 250 mM KCl, 0.25 mM EDTA, 2.5 mM DTT and 0.5% Tween-20) by filling the concentrator to 20 ml and concentrating the volume to 10 ml each time. The concentrator is emptied and rinsed with 10 ml 2.5× storage buffer which is combined with the concentrate to provide Fraction V.
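
Each fill-and-concentrate cycle (20 ml back down to 10 ml) halves the amount of original buffer remaining, so five cycles leave roughly 3% of it. The sketch below shows the calculation. It assumes ideal mixing and that the small buffer solutes pass freely through the membrane; the function name is illustrative.

```python
def residual_fraction(fill_ml, concentrate_ml, cycles):
    """Fraction of the original buffer solutes left after repeated
    dilute-and-concentrate cycles (ideal mixing, freely permeating solutes)."""
    if fill_ml <= 0 or not 0 < concentrate_ml <= fill_ml or cycles < 0:
        raise ValueError("invalid volumes or cycle count")
    return (concentrate_ml / fill_ml) ** cycles


residual = residual_fraction(fill_ml=20.0, concentrate_ml=10.0, cycles=5)

print(f"original buffer remaining: {residual:.2%}")
```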

Anion exchange chromatography is used to remove residual DNA. The procedure is conducted in a biological safety hood and sterile techniques are used. A Waters Sep-Pak plus QMA cartridge with a 0.2 micron sterile disposable syringe tip filter unit is equilibrated with 30 ml of 2.5× storage buffer using a syringe at a rate of about 5 drops per second. Using a disposable syringe, Fraction V is passed through the cartridge at about 1 drop/second and collected in a sterile tube. The cartridge is flushed with 5 ml of 2.5× storage buffer and pushed dry with air. The eluant is diluted 1.5× with 80% glycerol and stored at -20° C. The resulting final Fraction VI pool contains active thermostable DNA polymerase with altered 5' to 3' exonuclease activity.

In addition to site-directed mutagenesis of a nucleotide sequence, deletion mutagenesis techniques may also be used to attenuate the 5' to 3' exonuclease activity of a thermostable DNA polymerase. One example of such a deletion mutation is the deletion of all amino terminal amino acids up to and including the glycine in the conserved A(V/T)YG (SEQ ID NO:15) domain of thermostable DNA polymerases.

A second deletion mutation affecting 5' to 3' exonuclease activity is a deletion up to Ala 77 in Taq DNA polymerase. This amino acid (Ala 77) has been identified as the amino terminal amino acid in an approximately 85.5 kDa proteolytic product of Taq DNA polymerase. This proteolytic product has been identified in several native Taq DNA polymerase preparations and the protein appears to be stable. Since such a deletion up to Ala 77 includes Gly 46, it will also affect the 5' to 3' exonuclease activity of Taq DNA polymerase.

However, a deletion mutant beginning with Ala 77 has the added advantage over a deletion mutant beginning with phenylalanine 47 in that the proteolytic evidence suggests that the peptide will remain stable. Furthermore, Ala 77 is found within the sequence HEAYG (SEQ ID NO:16) 4 amino acids prior to the sequence YKA in Taq DNA polymerase. A similar sequence motif HEAYE (SEQ ID NO:17) is found in Tth DNA polymerase, TZ05 DNA polymerase and Tsps17 DNA polymerase. The alanine is 4 amino acids prior to the conserved motif YKA. The amino acids in the other exemplary thermostable DNA polymerases which correspond to Taq Ala 77 are Tth Ala 78, TZ05 Ala 78, Tsps17 Ala 74, Tma Leu 72 and Taf Ile 73. A deletion up to the alanine or corresponding amino acid in the motif HEAY(G/E) (SEQ ID NO:16 or SEQ ID NO:17) in a Thermus species thermostable DNA polymerase containing this sequence will attenuate its 5' to 3' exonuclease activity. The 5' to 3' exonuclease motif YKA is also conserved in Tma DNA polymerase (amino acids 76-78) and Taf DNA polymerase (amino acids 77-79). In this thermostable polymerase family, the conserved motif (L/I)LET (SEQ ID NO:18) immediately precedes the YKA motif. Taf DNA polymerase Ile 73 is 4 residues prior to this YKA motif while Tma DNA polymerase Leu 72 is 4 residues prior to the YKA motif. A deletion of the Leu or Ile in the motif (L/I)LETYKA (SEQ ID NO:19) in a thermostable DNA polymerase from the Thermotoga or Thermosipho genus will also attenuate 5' to 3' exonuclease activity.

Thus, a conserved amino acid sequence which defines the 5' to 3' exonuclease activity of DNA polymerases of the Thermus genus as well as those of Thermotoga and Thermosipho has been identified as (I/L/A)X₃YKA (SEQ ID NO:20), wherein X₃ is any sequence of three amino acids. Therefore, the 5' to 3' exonuclease activity of thermostable DNA polymerases may also be altered by mutating this conserved amino acid domain.
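
The consensus (I/L/A)X₃YKA can be written as a regular expression and used to locate the motif in a protein sequence given in one-letter code. The following is a minimal sketch; the two short sequences are toy fragments built around the motifs named in the text, not the actual polymerase sequences, and the function name is illustrative.

```python
import re

# (I/L/A) followed by any three residues, then YKA.
MOTIF = re.compile(r"[ILA].{3}YKA")


def find_exo_motif(protein):
    """Return (1-based position, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(protein.upper())]


# Toy fragments for illustration only.
thermus_like = "MKHEAYGGYKAGR"    # contains HEAYG ... YKA
thermotoga_like = "MSGLLETYKAQ"   # contains (L/I)LETYKA

hits_thermus = find_exo_motif(thermus_like)
hits_thermotoga = find_exo_motif(thermotoga_like)

print(hits_thermus, hits_thermotoga)
```

The reported position is that of the leading I, L or A, which is the residue the text identifies as the first amino acid retained in the corresponding deletion mutant.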

Those of skill in the art recognize that when such a deletion mutant is to be expressed in recombinant host cells, a methionine codon is usually placed at the 5' end of the coding sequence, so that the amino terminal sequence of the deletion mutant protein would be MET-ALA in the Thermus genus examples above.
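
At the level of the coding sequence, such an N-terminal deletion amounts to dropping the codons before the first retained residue and placing an ATG in front of what remains. The sketch below illustrates this on a seven-codon toy sequence, which is a placeholder and not a polymerase gene; the function name is an assumption.

```python
def n_terminal_deletion(coding_dna, first_kept_residue):
    """Drop all codons before first_kept_residue (1-based) and make sure
    the shortened coding sequence starts with a methionine codon."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = (first_kept_residue - 1) * 3
    if not 0 <= start < len(coding_dna):
        raise ValueError("first_kept_residue is outside the coding sequence")
    kept = coding_dna[start:]
    # Add ATG only if the retained sequence does not already begin with one.
    return kept if kept.startswith("ATG") else "ATG" + kept


# Toy coding sequence: Met-Lys-His-Glu-Ala-Tyr-Gly (placeholder only).
toy_gene = "ATGAAACACGAAGCTTATGGT"

# Keep from residue 5 (Ala) onward, giving Met-Ala-Tyr-Gly.
truncated = n_terminal_deletion(toy_gene, first_kept_residue=5)

print(truncated)
```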

The preferred techniques for performing deletion mutations involve utilization of known restriction sites on the nucleotide sequence of the thermostable DNA polymerase. Following identification of the particular amino acid or amino acids which are to be deleted, a restriction site is identified which, when cleaved, will cause the cleavage of the target DNA sequence at, or slightly 3' distal to, the position corresponding to the amino acid or domain to be deleted, but which retains domains which code for other properties of the polymerase which are desired.

Alternatively, restriction sites on either side (5' or 3') of the sequence coding for the target amino acid or domain may be utilized to cleave the sequence. However, a ligation of the two desired portions of the sequence will then be necessary. This ligation may be performed using techniques which are standard in the art and exemplified in Example 9 of Ser. No. 523,394, filed May 15, 1990, which issued as U.S. Pat. No. 5,079,352, Example 7 of PCT Application No. 91/05753, filed Aug. 13, 1991, which published on Mar. 5, 1992, and Ser. No. 590,490, filed Sep. 28, 1990, all of which are incorporated herein by reference.

Another technique for achieving a deletion mutation of the thermostable DNA polymerase is by utilizing the PCR mutagenesis process. In this process, primers are prepared which incorporate a restriction site and optionally a methionine codon if such a codon is not already present. Thus, the product of the PCR with this primer may be digested with an appropriate restriction enzyme to remove the domain which codes for 5' to 3' exonuclease activity of the enzyme. Then, the two remaining sections of the product are ligated to form the coding sequence for a thermostable DNA polymerase lacking 5' to 3' exonuclease activity. Such coding sequences can be utilized in expression vectors in appropriate host cells to produce the desired thermostable DNA polymerase lacking 5' to 3' exonuclease activity.

In addition to the Taq DNA polymerase mutants with reduced 5' to 3' exonuclease activity, it has also been found that a truncated Tma DNA polymerase with reduced 5' to 3' exonuclease activity may be produced by recombinant techniques even when the complete coding sequence of the Tma DNA polymerase gene is present in an expression vector in E. coli. Such a truncated Tma DNA polymerase is formed by translation starting with the methionine codon at position 140. Furthermore, recombinant means may be used to produce a truncated polymerase corresponding to the protein produced by initiating translation at the methionine codon at position 284 of the Tma coding sequence.

The Tma DNA polymerase lacking amino acids 1 through 139 (about 86 kDa), and the Tma DNA polymerase lacking amino acids 1 through 283 (about 70 kDa) retain polymerase activity but have attenuated 5' to 3' exonuclease activity. An additional advantage of the 70 kDa Tma DNA polymerase is that it is significantly more thermostable than native Tma polymerase.

Thus, it has been found that the entire sequence of the intact Tma DNA polymerase I enzyme is not required for activity. Portions of the Tma DNA polymerase I coding sequence can be used in recombinant DNA techniques to produce a biologically active gene product with DNA polymerase activity.

Furthermore, the availability of DNA encoding the Tma DNA polymerase sequence provides the opportunity to modify the coding sequence so as to generate mutein (mutant protein) forms also having DNA polymerase activity but with attenuated 5' to 3' exonuclease activity. The amino(N)-terminal portion of the Tma DNA polymerase is not necessary for polymerase activity but rather encodes the 5' to 3' exonuclease activity of the protein.

Thus, using recombinant DNA methodology, one can delete up to approximately one-third of the N-terminal coding sequence of the Tma gene, clone, and express a gene product that is quite active in polymerase assays but, depending on the extent of the deletion, has no 5' to 3' exonuclease activity. Because certain N-terminal shortened forms of the polymerase are active, the gene constructs used for expression of these polymerases can include the corresponding shortened forms of the coding sequence.

In addition to the N-terminal deletions, individual amino acid residues in the peptide chain of Tma DNA polymerase or other thermostable DNA polymerases may be modified by oxidation, reduction, or other derivation, and the protein may be cleaved to obtain fragments that retain polymerase activity but have attenuated 5' to 3' exonuclease activity. Modifications to the primary structure of the Tma DNA polymerase coding sequence or the coding sequences of other thermostable DNA polymerases by deletion, addition, or alteration so as to change the amino acids incorporated into the thermostable DNA polymerase during translation of the mRNA produced from that coding sequence can be made without destroying the high temperature DNA polymerase activity of the protein.

Another technique for preparing thermostable DNA polymerases containing novel properties such as reduced or enhanced 5' to 3' exonuclease activity is a "domain shuffling" technique for the construction of "thermostable chimeric DNA polymerases". For example, substitution of the Tma DNA polymerase coding sequence comprising codons about 291 through about 484 for the Taq DNA polymerase I codons 289-422 would yield a novel thermostable DNA polymerase containing the 5' to 3' exonuclease domain of Taq DNA polymerase (1-289), the 3' to 5' exonuclease domain of Tma DNA polymerase (291-484), and the DNA polymerase domain of Taq DNA polymerase (423-832). Alternatively, the 5' to 3' exonuclease domain and the 3' to 5' exonuclease domains of Tma DNA polymerase (ca. codons 1-484) may be fused to the DNA polymerase (dNTP binding and primer/template binding domains) portions of Taq DNA polymerase (ca. codons 423-832).
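
In sequence terms, the first chimera described above is a concatenation of three codon ranges. The following sketch assembles such a construct from 1-based, inclusive codon coordinates, taking the three domain boundaries as listed (Taq 1-289, Tma 291-484, Taq 423-832). The input sequences are uniform placeholders of arbitrary length, not the real genes, and the helper names are illustrative.

```python
def codon_slice(coding_dna, first_codon, last_codon):
    """Return codons first_codon..last_codon (1-based, inclusive)."""
    if first_codon < 1 or last_codon < first_codon:
        raise ValueError("invalid codon range")
    if last_codon * 3 > len(coding_dna):
        raise ValueError("codon range extends past the coding sequence")
    return coding_dna[(first_codon - 1) * 3 : last_codon * 3]


def build_chimera(segments):
    """Concatenate (sequence, first_codon, last_codon) segments in order."""
    return "".join(codon_slice(seq, first, last) for seq, first, last in segments)


# Placeholder sequences only: one repeated codon each, so the origin of
# every segment in the chimera is easy to recognize.
taq_placeholder = "GCT" * 832
tma_placeholder = "AAA" * 893

chimera = build_chimera([
    (taq_placeholder, 1, 289),    # Taq 5' to 3' exonuclease domain
    (tma_placeholder, 291, 484),  # Tma 3' to 5' exonuclease domain
    (taq_placeholder, 423, 832),  # Taq DNA polymerase domain
])

chimera_codons = len(chimera) // 3
print(chimera_codons)
```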

As is apparent, the donors and recipients for the creation of "thermostable chimeric DNA polymerase" by "domain shuffling" need not be limited to Taq and Tma DNA polymerases. Other thermostable polymerases provide domains analogous to those of Taq and Tma DNA polymerases. Furthermore, the 5' to 3' exonuclease domain may derive from a thermostable DNA polymerase with altered 5' to 3' nuclease activity. For example, the 1 to 289 5' to 3' nuclease domain of Taq DNA polymerase may derive from a Gly (46) to Asp mutant form of the Taq polymerase gene. Similarly, the 5' to 3' nuclease and 3' to 5' nuclease domains of Tma DNA polymerase may encode a 5' to 3' exonuclease deficient domain, and be retrieved as a Tma Gly (37) to Asp amino acid 1 to 484 encoding DNA fragment or alternatively a truncated Met 140 to amino acid 484 encoding DNA fragment.

While any of a variety of means may be used to generate chimeric DNA polymerase coding sequences (possessing novel properties), a preferred method employs "overlap" PCR. In this method, the intended junction sequence is designed into the PCR primers (at their 5'-ends). Following the initial amplification of the individual domains, the various products are diluted (ca. 100 to 1000-fold) and combined, denatured, annealed, extended, and then the final forward and reverse primers are added for an otherwise standard PCR.
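
One way to picture the junction primers used in "overlap" PCR is sketched below. The forward primer for the downstream domain carries, at its 5' end, the last bases of the upstream domain, and the reverse primer for the upstream domain carries, at its 5' end, the complement of the first bases of the downstream domain. The tail and annealing lengths and the toy sequences are assumptions chosen for illustration; practical primer design would also consider melting temperature and secondary structure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase or lowercase A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def junction_primers(upstream, downstream, tail=9, anneal=12):
    """Return (forward primer for downstream, reverse primer for upstream),
    both written 5' to 3', each with a 5' tail spanning the junction."""
    upstream, downstream = upstream.upper(), downstream.upper()
    if len(upstream) < max(tail, anneal) or len(downstream) < max(tail, anneal):
        raise ValueError("fragments are shorter than the requested primer parts")
    # 5' tail from the end of upstream, 3' part anneals to the start of downstream.
    fwd_downstream = upstream[-tail:] + downstream[:anneal]
    # 5' tail complementary to the start of downstream, 3' part anneals to
    # the end of upstream (bottom strand, so take the reverse complement).
    rev_upstream = reverse_complement(upstream[-anneal:] + downstream[:tail])
    return fwd_downstream, rev_upstream


# Toy fragments for illustration only.
fwd, rev = junction_primers("AAAACCCCGGGG", "TTTTACGTACGT", tail=4, anneal=6)
print(fwd, rev)
```

Because both primers span the same junction, the two first-round products share a short identical region, which is what allows them to anneal to one another and be extended in the second round.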

Those of skill in the art recognize that the above thermostable DNA polymerases with attenuated 5' to 3' exonuclease activity are most easily constructed by recombinant DNA techniques. When one desires to produce one of the mutant enzymes of the present invention, with attenuated 5' to 3' exonuclease activity or a derivative or homologue of those enzymes, the production of a recombinant form of the enzyme typically involves the construction of an expression vector, the transformation of a host cell with the vector, and culture of the transformed host cell under conditions such that expression will occur.

To construct the expression vector, a DNA is obtained that encodes the mature (used here to include all chimeras or muteins) enzyme or a fusion of the mutant polymerase to an additional sequence that does not destroy activity or to an additional sequence cleavable under controlled conditions (such as treatment with peptidase) to give an active protein. The coding sequence is then placed in operable linkage with suitable control sequences in an expression vector. The vector can be designed to replicate autonomously in the host cell or to integrate into the chromosomal DNA of the host cell. The vector is used to transform a suitable host, and the transformed host is cultured under conditions suitable for expression of the recombinant polymerase.

Each of the foregoing steps can be done in a variety of ways. For example, the desired coding sequence may be obtained from genomic fragments and used directly in appropriate hosts. The construction of expression vectors operable in a variety of hosts is made using appropriate replicons and control sequences, as set forth generally below. Construction of suitable vectors containing the desired coding and control sequences employs standard ligation and restriction techniques that are well understood in the art. Isolated plasmids, DNA sequences, or synthesized oligonucleotides are cleaved, modified, and religated in the form desired. Suitable restriction sites can, if not normally available, be added to the ends of the coding sequence so as to facilitate construction of an expression vector, as exemplified below.

Site-specific DNA cleavage is performed by treating with suitable restriction enzyme (or enzymes) under conditions that are generally understood in the art and specified by the manufacturers of commercially available restriction enzymes. See, e.g., New England Biolabs, Product Catalog. In general, about 1 μg of plasmid or other DNA is cleaved by one unit of enzyme in about 20 μl of buffer solution; in the examples below, an excess of restriction enzyme is generally used to ensure complete digestion of the DNA. Incubation times of about one to two hours at about 37° C. are typical, although variations can be tolerated. After each incubation, protein is removed by extraction with phenol and chloroform; this extraction can be followed by ether extraction and recovery of the DNA from aqueous fractions by precipitation with ethanol. If desired, size separation of the cleaved fragments may be performed by polyacrylamide gel or agarose gel electrophoresis using standard techniques. See, e.g., Methods in Enzymology, 1980, 65:499-560.

Restriction-cleaved fragments with single-strand "overhanging" termini can be made blunt-ended (double-strand ends) by treating with the large fragment of E. coli DNA polymerase I (Klenow) in the presence of the four deoxynucleoside triphosphates (dNTPs) using incubation times of about 15 to 25 minutes at 20° C. to 25° C. in 50 mM Tris-Cl pH 7.6, 50 mM NaCl, 10 mM MgCl₂, 10 mM DTT, and 5 to 10 μM dNTPs. The Klenow fragment fills in at 5' protruding ends, but chews back protruding 3' single strands, even though the four dNTPs are present. If desired, selective repair can be performed by supplying only one, or selected ones, of the dNTPs within the limitations dictated by the nature of the protruding ends. After treatment with Klenow, the mixture is extracted with phenol/chloroform and ethanol precipitated. Similar results can be achieved using S1 nuclease, because treatment under appropriate conditions with S1 nuclease results in hydrolysis of any single-stranded portion of a nucleic acid.

Synthetic oligonucleotides can be prepared using the triester method of Matteucci et al., 1981, J. Am. Chem. Soc. 103:3185-3191, or automated synthesis methods. Kinasing of single strands prior to annealing or for labeling is achieved using an excess, e.g., approximately 10 units, of polynucleotide kinase to 0.5 μM substrate in the presence of 50 mM Tris, pH 7.6, 10 mM MgCl₂, 5 mM dithiothreitol (DTT), and 1 to 2 μM ATP. If kinasing is for labeling of probe, the ATP will contain high specific activity γ-³²P.

Ligations are performed in 15-30 μl volumes under the following standard conditions and temperatures: 20 mM Tris-Cl, pH 7.5, 10 mM MgCl₂, 10 mM DTT, 33 μg/ml BSA, 10 mM-50 mM NaCl, and either 40 μM ATP and 0.01-0.02 (Weiss) units T4 DNA ligase at 0° C. (for ligation of fragments with complementary single-stranded ends) or 1 mM ATP and 0.3-0.6 units T4 DNA ligase at 14° C. (for "blunt end" ligation). Intermolecular ligations of fragments with complementary ends are usually performed at 33-100 μg/ml total DNA concentrations (5 to 100 nM total ends concentration). Intermolecular blunt end ligations (usually employing a 20 to 30 fold molar excess of linkers, optionally) are performed at 1 μM total ends concentration.
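
The relationship between a mass concentration of DNA and the molar concentration of ends depends on fragment length. The sketch below converts one to the other for linear double-stranded fragments, using the common approximation of about 660 g/mol per base pair and two ends per fragment. The fragment lengths in the example are hypothetical, and the function name is illustrative.

```python
AVG_G_PER_MOL_PER_BP = 660.0  # common approximation for double-stranded DNA


def ends_concentration_nM(dna_ug_per_ml, fragment_bp):
    """Molar concentration of ends (nM) for linear dsDNA fragments of a
    single length, given the mass concentration in ug/ml."""
    if dna_ug_per_ml < 0 or fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and length > 0")
    grams_per_liter = dna_ug_per_ml * 1e-3           # ug/ml equals mg/l
    fragments_molar = grams_per_liter / (AVG_G_PER_MOL_PER_BP * fragment_bp)
    return 2 * fragments_molar * 1e9                 # two ends per fragment


# Hypothetical fragment lengths, chosen only to illustrate the scaling.
ends_short = ends_concentration_nM(33.0, 1000)    # 1 kb fragments
ends_long = ends_concentration_nM(33.0, 20000)    # 20 kb fragments

print(f"{ends_short:.0f} nM ends at 1 kb, {ends_long:.0f} nM ends at 20 kb")
```

At a fixed mass concentration, shorter fragments give proportionally more ends, which is why a mass range corresponds to a range of ends concentrations.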

In vector construction, the vector fragment is commonly treated with bacterial or calf intestinal alkaline phosphatase (BAP or CIAP) to remove the 5' phosphate and prevent religation and reconstruction of the vector. BAP and CIAP digestion conditions are well known in the art, and published protocols usually accompany the commercially available BAP and CIAP enzymes. To recover the nucleic acid fragments, the preparation is extracted with phenol-chloroform and ethanol precipitated to remove the phosphatase and purify the DNA. Alternatively, religation of unwanted vector fragments can be prevented by restriction enzyme digestion before or after ligation, if appropriate restriction sites are available.

For portions of vectors or coding sequences that require sequence modifications, a variety of site-specific primer-directed mutagenesis methods are available. The polymerase chain reaction (PCR) can be used to perform site-specific mutagenesis. In another technique now standard in the art, a synthetic oligonucleotide encoding the desired mutation is used as a primer to direct synthesis of a complementary nucleic acid sequence of a single-stranded vector, such as pBS13+, that serves as a template for construction of the extension product of the mutagenizing primer. The mutagenized DNA is transformed into a host bacterium, and cultures of the transformed bacteria are plated and identified. The identification of modified vectors may involve transfer of the DNA of selected transformants to a nitrocellulose filter or other membrane and hybridization of the "lifts" with kinased synthetic primer at a temperature that permits hybridization of an exact match to the modified sequence but prevents hybridization with the original strand. Transformants that contain DNA that hybridizes with the probe are then cultured and serve as a reservoir of the modified DNA.

In the constructions set forth below, correct ligations for plasmid construction are confirmed by first transforming E. coli strain DG101 or another suitable host with the ligation mixture. Successful transformants are selected by ampicillin, tetracycline or other antibiotic resistance or sensitivity or by using other markers, depending on the mode of plasmid construction, as is understood in the art. Plasmids from the transformants are then prepared according to the method of Clewell et al., 1969, Proc. Natl. Acad. Sci. USA 62:1159, optionally following chloramphenicol amplification (Clewell, 1972, J. Bacteriol. 110:667). Another method for obtaining plasmid DNA is described as the "Base-Acid" extraction method at page 11 of the Bethesda Research Laboratories publication Focus, volume 5, number 2, and very pure plasmid DNA can be obtained by replacing steps 12 through 17 of the protocol with CsCl/ethidium bromide ultracentrifugation of the DNA. The isolated DNA is analyzed by restriction enzyme digestion and/or sequenced by the dideoxy method of Sanger et al., 1977, Proc. Natl. Acad. Sci. USA 74:5463, as further described by Messing et al., 1981, Nuc. Acids Res. 9:309, or by the method of Maxam et al., 1980, Methods in Enzymology 65:499.

The control sequences, expression vectors, and transformation methods are dependent on the type of host cell used to express the gene. Generally, procaryotic, yeast, insect, or mammalian cells are used as hosts. Procaryotic hosts are in general the most efficient and convenient for the production of recombinant proteins and are therefore preferred for the expression of the thermostable DNA polymerases of the present invention.

The procaryote most frequently used to express recombinant proteins is E. coli. For cloning and sequencing, and for expression of constructions under control of most bacterial promoters, E. coli K12 strain MM294, obtained from the E. coli Genetic Stock Center under GCSC #6135, can be used as the host. For expression vectors with the P_(L) N_(RBS) control sequence, E. coli K12 strain MC1000 lambda lysogen, N₇ N₅₃ cI₈₅₇ SusP₈₀, ATCC 39531, may be used. E. coli DG116, which was deposited with the ATCC (ATCC 53606) on Apr. 7, 1987, and E. coli KB2, which was deposited with the ATCC (ATCC 53075) on Mar. 29, 1985, are also useful host cells. For M13 phage recombinants, E. coli strains susceptible to phage infection, such as E. coli K12 strain DG98, are employed. The DG98 strain was deposited with the ATCC (ATCC 39768) on Jul. 13, 1984.

However, microbial strains other than E. coli can also be used, such as bacilli, for example Bacillus subtilis, various species of Pseudomonas, and other bacterial strains, for recombinant expression of the thermostable DNA polymerases of the present invention. In such procaryotic systems, plasmid vectors that contain replication sites and control sequences derived from the host or a species compatible with the host are typically used.

For example, E. coli is typically transformed using derivatives of pBR322, described by Bolivar et al., 1977, Gene 2:95. Plasmid pBR322 contains genes for ampicillin and tetracycline resistance. These drug resistance markers can be either retained or destroyed in constructing the desired vector and so help to detect the presence of a desired recombinant. Commonly used procaryotic control sequences, i.e., a promoter for transcription initiation, optionally with an operator, along with a ribosome binding site sequence, include the β-lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., 1977, Nature 198:1056), the tryptophan (trp) promoter system (Goeddel et al., 1980, Nuc. Acids Res. 8:4057), and the lambda-derived P_(L) promoter (Shimatake et al., 1981, Nature 292:128) and N-gene ribosome binding site (N_(RBS)). A portable control system cassette is set forth in U.S. Pat. No. 4,711,845, issued Dec. 8, 1987. This cassette comprises a P_(L) promoter operably linked to the N_(RBS) in turn positioned upstream of a third DNA sequence having at least one restriction site that permits cleavage within six bp 3' of the N_(RBS) sequence. Also useful is the phosphatase A (phoA) system described by Chang et al. in European Patent Publication No. 196,864, published Oct. 8, 1986. However, any available promoter system compatible with procaryotes can be used to construct a modified thermostable DNA polymerase expression vector of the invention.

In addition to bacteria, eucaryotic microbes, such as yeast, can also be used as recombinant host cells. Laboratory strains of Saccharomyces cerevisiae, Baker's yeast, are most often used, although a number of other strains are commonly available. While vectors employing the two micron origin of replication are common (Broach, 1983, Meth. Enz. 101:307), other plasmid vectors suitable for yeast expression are known (see, for example, Stinchcomb et al., 1979, Nature 282:39; Tschempe et al., 1980, Gene 10:157; and Clarke et al., 1983, Meth. Enz. 101:300). Control sequences for yeast vectors include promoters for the synthesis of glycolytic enzymes (Hess et al., 1968, J. Adv. Enzyme Reg. 7:149; Holland et al., 1978, Biotechnology 17:4900; and Holland et al., 1981, J. Biol. Chem. 256:1385). Additional promoters known in the art include the promoter for 3-phosphoglycerate kinase (Hitzeman et al., 1980, J. Biol. Chem. 255:2073) and those for other glycolytic enzymes, such as glyceraldehyde 3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Other promoters that have the additional advantage of transcription controlled by growth conditions are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and enzymes responsible for maltose and galactose utilization (Holland, supra).

Terminator sequences may also be used to enhance expression when placed at the 3' end of the coding sequence. Such terminators are found in the 3' untranslated region following the coding sequences in yeast-derived genes. Any vector containing a yeast-compatible promoter, origin of replication, and other control sequences is suitable for use in constructing yeast expression vectors for the thermostable DNA polymerases of the present invention.

The nucleotide sequences which code for the thermostable DNA polymerases of the present invention can also be expressed in eucaryotic host cell cultures derived from multicellular organisms. See, for example, Tissue Culture, Academic Press, Cruz and Patterson, editors (1973). Useful host cell lines include COS-7, COS-A2, CV-1, murine cells such as murine myelomas N51 and VERO, HeLa cells, and Chinese hamster ovary (CHO) cells. Expression vectors for such cells ordinarily include promoters and control sequences compatible with mammalian cells such as, for example, the commonly used early and late promoters from Simian Virus 40 (SV 40) (Fiers et al., 1978, Nature 273:113), or other viral promoters such as those derived from polyoma, adenovirus 2, bovine papilloma virus (BPV), or avian sarcoma viruses, or immunoglobulin promoters and heat shock promoters. A system for expressing DNA in mammalian systems using a BPV vector system is disclosed in U.S. Pat. No. 4,419,446. A modification of this system is described in U.S. Pat. No. 4,601,978. General aspects of mammalian cell host system transformations have been described by Axel, U.S. Pat. No. 4,399,216. "Enhancer" regions are also important in optimizing expression; these are, generally, sequences found upstream of the promoter region. Origins of replication may be obtained, if needed, from viral sources. However, integration into the chromosome is a common mechanism for DNA replication in eucaryotes.

Plant cells can also be used as hosts, and control sequences compatible with plant cells, such as the nopaline synthase promoter and polyadenylation signal sequences (Depicker et al., 1982, J. Mol. Appl. Gen. 1:561) are available. Expression systems employing insect cells utilizing the control systems provided by baculovirus vectors have also been described (Miller et al., 1986, Genetic Engineering (Setlow et al., eds., Plenum Publishing) 8:277-297). Insect cell-based expression can be accomplished in Spodoptera frugiperda. These systems can also be used to produce recombinant thermostable polymerases of the present invention.

Depending on the host cell used, transformation is done using standard techniques appropriate to such cells. The calcium treatment employing calcium chloride, as described by Cohen, 1972, Proc. Natl. Acad. Sci. USA 69:2110 is used for procaryotes or other cells that contain substantial cell wall barriers. Infection with Agrobacterium tumefaciens (Shaw et al., 1983, Gene 23:315) is used for certain plant cells. For mammalian cells, the calcium phosphate precipitation method of Graham and van der Eb, 1978, Virology 52:546 is preferred. Transformations into yeast are carried out according to the method of Van Solingen et al., 1977, J. Bact. 130:946 and Hsiao et al., 1979, Proc. Natl. Acad. Sci. USA 76:3829.

Once the desired thermostable DNA polymerase with altered 5' to 3' exonuclease activity has been expressed in a recombinant host cell, purification of the protein may be desired. Although a variety of purification procedures can be used to purify the recombinant thermostable polymerases of the invention, fewer steps may be necessary to yield an enzyme preparation of equal purity. Because E. coli host proteins are heat-sensitive, the recombinant thermostable DNA polymerases of the invention can be substantially enriched by heat inactivating the crude lysate. This step is done in the presence of a sufficient amount of salt (typically 0.2-0.3M ammonium sulfate) to ensure dissociation of the thermostable DNA polymerase from the host DNA and to reduce ionic interactions of thermostable DNA polymerase with other cell lysate proteins.

In addition, the presence of 0.3M ammonium sulfate promotes hydrophobic interaction with a phenyl sepharose column. Hydrophobic interaction chromatography is a separation technique in which substances are separated on the basis of differing strengths of hydrophobic interaction with an uncharged bed material containing hydrophobic groups. Typically, the column is first equilibrated under conditions favorable to hydrophobic binding, such as high ionic strength. A descending salt gradient may then be used to elute the sample.

According to the invention, an aqueous mixture (containing the recombinant thermostable DNA polymerase with altered 5' to 3' exonuclease activity) is loaded onto a column containing a relatively strong hydrophobic gel such as phenyl sepharose (manufactured by Pharmacia) or Phenyl TSK (manufactured by Toyo Soda). To promote hydrophobic interaction with a phenyl sepharose column, a solvent is used that contains, for example, greater than or equal to 0.3M ammonium sulfate, with 0.3M being preferred, or greater than or equal to 0.5M NaCl. The column and the sample are adjusted to 0.3M ammonium sulfate in 50 mM Tris (pH 7.5) and 1.0 mM EDTA ("TE") buffer that also contains 0.5 mM DTT, and the sample is applied to the column. The column is washed with the 0.3M ammonium sulfate buffer. The enzyme may then be eluted with solvents that attenuate hydrophobic interactions, such as decreasing salt gradients, ethylene or propylene glycol, or urea.

For long-term stability, the thermostable DNA polymerase enzymes of the present invention can be stored in a buffer that contains one or more non-ionic polymeric detergents. Such detergents are generally those that have a molecular weight in the range of approximately 100 to 250,000 daltons, preferably about 4,000 to 200,000 daltons, and stabilize the enzyme at a pH of from about 3.5 to about 9.5, preferably from about 4 to 8.5. Examples of such detergents include those specified on pages 295-298 of McCutcheon's Emulsifiers & Detergents, North American edition (1983), published by the McCutcheon Division of MC Publishing Co., 175 Rock Road, Glen Rock, N.J. (USA) and Ser. No. 387,003, filed Jul. 28, 1989, now abandoned in favor of continuation application U.S. Ser. No. 07/873,897, filed Apr. 24, 1992, each of which is incorporated herein by reference.

Preferably, the detergents are selected from the group comprising ethoxylated fatty alcohol ethers and lauryl ethers, ethoxylated alkyl phenols, octylphenoxy polyethoxy ethanol compounds, modified oxyethylated and/or oxypropylated straight-chain alcohols, polyethylene glycol monooleate compounds, polysorbate compounds, and phenolic fatty alcohol ethers. More particularly preferred are Tween 20, a polyoxyethylated (20) sorbitan monolaurate from ICI Americas Inc., Wilmington, Del., and Iconol NP-40, an ethoxylated alkyl phenol (nonyl) from BASF Wyandotte Corp., Parsippany, N.J.

The thermostable enzymes of this invention may be used for any purpose in which such enzyme activity is necessary or desired.

DNA sequencing by the Sanger dideoxynucleotide method (Sanger et al., 1977, Proc. Natl. Acad. Sci. USA 74:5463-5467) has undergone significant refinement in recent years, including the development of novel vectors (Yanisch-Perron et al., 1985, Gene 33:103-119), base analogs (Mills et al., 1979, Proc. Natl. Acad. Sci. USA 76:2232-2235, and Barr et al., 1986, BioTechniques 4:428-432), enzymes (Tabor et al., 1987, Proc. Natl. Acad. Sci. USA 84:4763-4771, and Innis, M. A. et al., 1988, Proc. Natl. Acad. Sci. USA 85:9436-9440), and instruments for partial automation of DNA sequence analysis (Smith et al., 1986, Nature 321:674-679; Prober et al., 1987, Science 238:336-341; and Ansorge et al., 1987, Nuc. Acids Res. 15:4593-4602). The basic dideoxy sequencing procedure involves (i) annealing an oligonucleotide primer to a suitable single or denatured double stranded DNA template; (ii) extending the primer with DNA polymerase in four separate reactions, each containing one α-labeled dNTP or ddNTP (alternatively, a labeled primer can be used), a mixture of unlabeled dNTPs, and one chain-terminating dideoxynucleotide-5'-triphosphate (ddNTP); (iii) resolving the four sets of reaction products on a high-resolution polyacrylamide-urea gel; and (iv) producing an autoradiographic image of the gel that can be examined to infer the DNA sequence. Alternatively, fluorescently labeled primers or nucleotides can be used to identify the reaction products. Known dideoxy sequencing methods utilize a DNA polymerase such as the Klenow fragment of E. coli DNA polymerase I, reverse transcriptase, Taq DNA polymerase, or a modified T7 DNA polymerase.

The introduction of commercial kits has vastly simplified the art, making DNA sequencing a routine technique for any laboratory. However, there is still a need in the art for sequencing protocols that work well with nucleic acids that contain secondary structure such as palindromic hairpin loops and with G+C-rich DNA. Single stranded DNAs can form secondary structure, such as a hairpin loop, that can seriously interfere with a dideoxy sequencing protocol, either through improper termination in the extension reaction or, in the case of an enzyme with 5' to 3' exonuclease activity, through cleavage of the template strand at the juncture of the hairpin. Since high temperature destabilizes secondary structure, the ability to conduct the extension reaction at a high temperature, i.e., 70°-75° C., with a thermostable DNA polymerase results in a significant improvement in the sequencing of DNA that contains such secondary structure. However, temperatures compatible with polymerase extension do not eliminate all secondary structure. A 5' to 3' exonuclease-deficient thermostable DNA polymerase would be a further improvement in the art, since the polymerase could synthesize through the hairpin in a strand displacement reaction, rather than cleaving the template, resulting in an improper termination, i.e., an extension run-off fragment.
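By way of illustration only, a template can be screened for the two features named above, G+C richness and palindromic hairpin potential, with a short script. The following Python sketch is not part of the disclosed method; the function names, the default stem length of 6 bases, and the loop range of 3 to 8 bases are assumptions chosen for the example. It reports only exact inverted repeats and makes no thermodynamic prediction of whether a hairpin would persist at 70°-75° C.

```python
# Illustrative sketch only: names and default parameters are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, stem, loop) for each exact inverted repeat.

    A hit means the `stem` bases starting at `start` are followed, after
    `loop` unpaired bases, by their exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, stem, loop))
    return hits
```

For example, find_hairpins("TTGACTGCAAAAGCAGTCTT") reports a single 6-base stem beginning at position 2 with a 4-base loop.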

As an alternative to basic dideoxy sequencing, cycle dideoxy sequencing is a linear, asymmetric amplification of target sequences in the presence of dideoxy chain terminators. A single cycle produces a family of extension products of all possible lengths. Following denaturation of the extension reaction product from the DNA template, multiple cycles of primer annealing and primer extension occur in the presence of dideoxy terminators. The process is distinct from PCR in that only one primer is used, the accumulation of the sequencing reaction products in each cycle is linear, and the amplification products are heterogeneous in length and do not serve as template for the next reaction. Cycle dideoxy sequencing is a technique providing advantages for laboratories using automated DNA sequencing instruments and for other high volume sequencing laboratories. It is possible to directly sequence genomic DNA, without cloning, due to the specificity of the technique and the increased amount of signal generated. Cycle sequencing protocols accommodate single and double stranded templates, including genomic, cloned, and PCR-amplified templates.
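The difference between linear accumulation in cycle sequencing and exponential accumulation in PCR can be made concrete with a small calculation. The Python sketch below is an idealized model offered for illustration only; the function names and the per-cycle `efficiency` parameter are assumptions, and real reactions fall short of these figures as reagents are consumed.

```python
# Idealized model; function names and `efficiency` are illustrative assumptions.

def cycle_sequencing_products(template_copies: float, cycles: int,
                              efficiency: float = 1.0) -> float:
    """Extension products after `cycles` rounds with a single primer.

    Products do not serve as template, so each cycle adds the same amount.
    """
    return template_copies * efficiency * cycles


def pcr_copies(template_copies: float, cycles: int,
               efficiency: float = 1.0) -> float:
    """Total copies after `cycles` rounds with two primers.

    Products serve as template, so the pool grows by (1 + efficiency) per cycle.
    """
    return template_copies * (1.0 + efficiency) ** cycles
```

Under these assumptions, 1000 template copies carried through 30 cycles give 30,000 sequencing products but 1000 × 2³⁰ copies in PCR.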

Thermostable DNA polymerases have several advantages in cycle sequencing: they tolerate the stringent annealing temperatures which are required for specific hybridization of primer to genomic targets as well as tolerating the multiple cycles of high temperature denaturation which occur in each cycle. Performing the extension reaction at high temperatures, i.e., 70°-75° C., results in a significant improvement in sequencing results with DNA that contains secondary structure, due to the destabilization of secondary structure. However, such temperatures will not eliminate all secondary structure. A 5' to 3' exonuclease-deficient thermostable DNA polymerase would be a further improvement in the art, since the polymerase could synthesize through the hairpin in a strand displacement reaction, rather than cleaving the template and creating an improper termination. Additionally, like PCR, cycle sequencing suffers from the phenomenon of product strand renaturation. In the case of a thermostable DNA polymerase possessing 5' to 3' exonuclease activity, extension of a primer into a double stranded region created by product strand renaturation will result in cleavage of the renatured complementary product strand. The cleaved strand will be shorter and thus appear as an improper termination. In addition, the correct, previously synthesized termination signal will be attenuated. A thermostable DNA polymerase deficient in 5' to 3' exonuclease activity will improve the art, in that such extension product fragments will not be formed. A variation of cycle sequencing involves the simultaneous generation of sequencing ladders for each strand of a double stranded template while sustaining some degree of amplification (Ruano and Kidd, Proc. Natl. Acad. Sci. USA, 1991 88:2815-2819). This method of coupled amplification and sequencing would benefit in a similar fashion as standard cycle sequencing from the use of a thermostable DNA polymerase deficient in 5' to 3' exonuclease activity.

In a particularly preferred embodiment, the enzymes in which the 5' to 3' exonuclease activity has been reduced or eliminated catalyze the nucleic acid amplification reaction known as PCR and, as stated above, produce a better yield of desired product than is achieved with the respective native enzymes, which have greater amounts of the 5' to 3' exonuclease activity. Improved yields result because the mutant enzymes are unable to degrade previously synthesized product, a degradation that is caused by 5' to 3' exonuclease activity. This process for amplifying nucleic acid sequences is disclosed and claimed in U.S. Pat. Nos. 4,683,202 and 4,965,188, each of which is incorporated herein by reference. The PCR nucleic acid amplification method involves amplifying at least one specific nucleic acid sequence contained in a nucleic acid or a mixture of nucleic acids and, in the most common embodiment, produces double-stranded DNA. Aside from improved yields, thermostable DNA polymerases with attenuated 5' to 3' exonuclease activity exhibit an improved ability to generate longer PCR products, an improved ability to produce products from G+C-rich templates and an improved ability to generate PCR products and DNA sequencing ladders from templates with a high degree of secondary structure.

For ease of discussion, the protocol set forth below assumes that the specific sequence to be amplified is contained in a double-stranded nucleic acid. However, the process is equally useful in amplifying single-stranded nucleic acid, such as mRNA, although in the preferred embodiment the ultimate product is still double-stranded DNA. In the amplification of a single-stranded nucleic acid, the first step involves the synthesis of a complementary strand (one of the two amplification primers can be used for this purpose), and the succeeding steps proceed as in the double-stranded amplification process described below.

This amplification process comprises the steps of:

(a) contacting each nucleic acid strand with four different nucleoside triphosphates and two oligonucleotide primers for each specific sequence being amplified, wherein each primer is selected to be substantially complementary to the different strands of the specific sequence, such that the extension product synthesized from one primer, when separated from its complement, can serve as a template for synthesis of the extension product of the other primer, said contacting being at a temperature that allows hybridization of each primer to a complementary nucleic acid strand;

(b) contacting each nucleic acid strand, at the same time as or after step (a), with a thermostable DNA polymerase of the present invention that enables combination of the nucleoside triphosphates to form primer extension products complementary to each strand of the specific nucleic acid sequence;

(c) maintaining the mixture from step (b) at an effective temperature for an effective time to promote the activity of the enzyme and to synthesize, for each different sequence being amplified, an extension product of each primer that is complementary to each nucleic acid strand template, but not so high as to separate each extension product from the complementary strand template;

(d) heating the mixture from step (c) for an effective time and at an effective temperature to separate the primer extension products from the templates on which they were synthesized to produce single-stranded molecules but not so high as to denature irreversibly the enzyme;

(e) cooling the mixture from step (d) for an effective time and to an effective temperature to promote hybridization of a primer to each of the single-stranded molecules produced in step (d); and

(f) maintaining the mixture from step (e) at an effective temperature for an effective time to promote the activity of the enzyme and to synthesize, for each different sequence being amplified, an extension product of each primer that is complementary to each nucleic acid template produced in step (d) but not so high as to separate each extension product from the complementary strand template. The effective times and temperatures in steps (e) and (f) may coincide, so that steps (e) and (f) can be carried out simultaneously. Steps (d)-(f) are repeated until the desired level of amplification is obtained.
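The bookkeeping implied by repeating steps (d)-(f) can be sketched as follows. The Python example below is illustrative only and assumes one starting duplex and perfect efficiency in every cycle; the function name and the labels "original", "long", and "defined" are hypothetical. It tracks three kinds of strand: the two original strands, extension products copied from an original strand (fixed at one end only), and extension products copied from another extension product (fixed at both ends, hence of defined length).

```python
# Idealized strand bookkeeping; names and labels are illustrative assumptions.

def strand_counts(cycles: int) -> dict:
    """Strand counts after `cycles` rounds, starting from one duplex."""
    original, long_products, defined_products = 2, 0, 0
    for _ in range(cycles):
        # Each original strand templates one new long product per cycle.
        new_long = original
        # Long and defined-length products both template defined-length products.
        new_defined = long_products + defined_products
        long_products += new_long
        defined_products += new_defined
    return {
        "original": original,
        "long": long_products,
        "defined": defined_products,
    }
```

In this model the long products grow only linearly (2 per cycle), while the defined-length products grow exponentially and dominate the mixture after a modest number of cycles.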

The amplification method is useful not only for producing large amounts of a specific nucleic acid sequence of known sequence but also for producing nucleic acid sequences that are known to exist but are not completely specified. One need know only a sufficient number of bases at both ends of the sequence in sufficient detail so that two oligonucleotide primers can be prepared that will hybridize to different strands of the desired sequence at relative positions along the sequence such that an extension product synthesized from one primer, when separated from the template (complement), can serve as a template for extension of the other primer into a nucleic acid sequence of defined length. The greater the knowledge about the bases at both ends of the sequence, the greater can be the specificity of the primers for the target nucleic acid sequence and the efficiency of the process and specificity of the reaction.

In any case, an initial copy of the sequence to be amplified must be available, although the sequence need not be pure or a discrete molecule. In general, the amplification process involves a chain reaction for producing, in exponential quantities relative to the number of reaction steps involved, at least one specific nucleic acid sequence given that (a) the ends of the required sequence are known in sufficient detail that oligonucleotides can be synthesized that will hybridize to them and (b) that a small amount of the sequence is available to initiate the chain reaction. The product of the chain reaction will be a discrete nucleic acid duplex with termini corresponding to the 5' ends of the specific primers employed.
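That the product termini are set by the 5' ends of the two primers can be shown with a simple string model. The Python sketch below is an illustration under stated assumptions: the template is given as one strand written 5' to 3', the primers match their sites exactly, and the function name `amplicon` is hypothetical.

```python
# Illustrative string model; exact-match priming is an assumption.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the top strand of the defined-length product, or None.

    `forward` has the same sequence as the top strand at its site;
    `reverse` anneals to the top strand, so its reverse complement
    appears in `template` downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]
```

For the template AAAACCGGTTAGCTAGCTAGGATCATTTT with primers CCGGTT and TGATCC, the model returns CCGGTTAGCTAGCTAGGATCA: the top strand begins with the first primer, and its complement begins with the second.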

Any nucleic acid sequence, in purified or nonpurified form, can be utilized as the starting nucleic acid(s), provided it contains or is suspected to contain the specific nucleic acid sequence one desires to amplify. The nucleic acid to be amplified can be obtained from any source, for example, from plasmids such as pBR322, from cloned DNA or RNA, or from natural DNA or RNA from any source, including bacteria, yeast, viruses, organelles, and higher organisms such as plants and animals. DNA or RNA may be extracted from blood, tissue material such as chorionic villi, or amniotic cells by a variety of techniques. See, e.g., Maniatis et al., 1982, Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.) pp. 280-281. Thus, the process may employ, for example, DNA or RNA, including messenger RNA, which DNA or RNA may be single-stranded or double-stranded. In addition, a DNA-RNA hybrid that contains one strand of each may be utilized. A mixture of any of these nucleic acids can also be employed as can nucleic acids produced from a previous amplification reaction (using the same or different primers). The specific nucleic acid sequence to be amplified can be only a fraction of a large molecule or can be present initially as a discrete molecule, so that the specific sequence constitutes the entire nucleic acid.

The sequence to be amplified need not be present initially in a pure form; the sequence can be a minor fraction of a complex mixture, such as a portion of the β-globin gene contained in whole human DNA (as exemplified in Saiki et al., 1985, Science 230:1350-1354) or a portion of a nucleic acid sequence due to a particular microorganism, which organism might constitute only a very minor fraction of a particular biological sample. The cells can be directly used in the amplification process after suspension in hypotonic buffer and heat treatment at about 90° C.-100° C. until cell lysis and dispersion of intracellular components occur (generally 1 to 15 minutes). After the heating step, the amplification reagents may be added directly to the lysed cells. The starting nucleic acid sequence can contain more than one desired specific nucleic acid sequence. The amplification process is useful not only for producing large amounts of one specific nucleic acid sequence but also for amplifying simultaneously more than one different specific nucleic acid sequence located on the same or different nucleic acid molecules.

Primers play a key role in the PCR process. The word "primer" as used in describing the amplification process can refer to more than one primer, particularly in the case where there is some ambiguity in the information regarding the terminal sequence(s) of the fragment to be amplified or where one employs the degenerate primer process described in PCT Application No. 91/05753, filed Aug. 13, 1991, which published on Mar. 5, 1992. For instance, in the case where a nucleic acid sequence is inferred from protein sequence information, a collection of primers containing sequences representing all possible codon variations based on degeneracy of the genetic code can be used for each strand. One primer from this collection will be sufficiently homologous with a portion of the desired sequence to be amplified so as to be useful for amplification.
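The size of such a primer collection follows directly from the genetic code. The Python sketch below, offered only as an illustration, enumerates every coding sequence for a short peptide using the standard genetic code; the names `degeneracy` and `coding_sequences` are hypothetical, and the sketch is not the degenerate primer process of the cited PCT application.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# Map each amino acid (one-letter code) to the codons that encode it.
BACK_TABLE = {}
for codon, amino_acid in zip(CODONS, AMINO):
    BACK_TABLE.setdefault(amino_acid, []).append(codon)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide`."""
    count = 1
    for amino_acid in peptide.upper():
        count *= len(BACK_TABLE[amino_acid])
    return count


def coding_sequences(peptide: str) -> list:
    """All DNA sequences (sense strand, 5' to 3') that encode `peptide`."""
    pools = [BACK_TABLE[amino_acid] for amino_acid in peptide.upper()]
    return ["".join(codons) for codons in product(*pools)]
```

A peptide rich in Met and Trp (one codon each) yields a small collection, whereas each Leu, Ser, or Arg multiplies the collection sixfold, which is one reason such residues are commonly avoided when choosing a region for degenerate primers.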

In addition, more than one specific nucleic acid sequence can be amplified from the first nucleic acid or mixture of nucleic acids, so long as the appropriate number of different oligonucleotide primers are utilized. For example, if two different specific nucleic acid sequences are to be produced, four primers are utilized. Two of the primers are specific for one of the specific nucleic acid sequences, and the other two primers are specific for the second specific nucleic acid sequence. In this manner, each of the two different specific sequences can be produced exponentially by the present process.

A sequence within a given sequence can be amplified after a given number of amplification cycles to obtain greater specificity in the reaction by adding, after at least one cycle of amplification, a set of primers that are complementary to internal sequences (i.e., sequences that are not on the ends) of the sequence to be amplified. Such primers can be added at any stage and will provide a shorter amplified fragment. Alternatively, a longer fragment can be prepared by using primers with non-complementary ends but having some overlap with the primers previously utilized in the amplification.

Primers also play a key role when the amplification process is used for in vitro mutagenesis. The product of an amplification reaction where the primers employed are not exactly complementary to the original template will contain the sequence of the primer rather than the template, so introducing an in vitro mutation. In further cycles, this mutation will be amplified with an undiminished efficiency because no further mispaired priming is required. The process of making an altered DNA sequence as described above could be repeated on the altered DNA using different primers to induce further sequence changes. In this way, a series of mutated sequences can gradually be produced wherein each new addition to the series differs from the last in a minor way, but from the original DNA source sequence in an increasingly major way.
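The way a mispaired primer writes its own sequence into the product can be illustrated with a toy string model. In the Python sketch below, which is an assumption-laden illustration rather than a description of any actual priming chemistry, the primer is written in the same sense as the strand shown, a site is accepted if it differs from the primer at no more than `max_mismatches` positions, and the function name `extend_from_primer` is hypothetical.

```python
# Toy model of primer-directed mutagenesis; all names are illustrative.

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def extend_from_primer(strand: str, primer: str, max_mismatches: int = 1):
    """Return the product strand, or None if no acceptable site exists.

    The product carries the primer's own sequence at its 5' end, followed
    by the unchanged sequence downstream of the priming site.
    """
    n = len(primer)
    for i in range(len(strand) - n + 1):
        if hamming(strand[i:i + n], primer) <= max_mismatches:
            return primer + strand[i + n:]
    return None
```

Priming ACGTACGTTTGACCA with ACGAACGT (one mismatch) gives ACGAACGTTTGACCA. Priming that product again with the same primer requires no mismatch at all, which is the sense in which later cycles proceed with undiminished efficiency.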

Because the primer can contain as part of its sequence a non-complementary sequence, provided that a sufficient amount of the primer contains a sequence that is complementary to the strand to be amplified, many other advantages can be realized. For example, a nucleotide sequence that is not complementary to the template sequence (such as, e.g., a promoter, linker, coding sequence, etc.) may be attached at the 5' end of one or both of the primers and so appended to the product of the amplification process. After the extension primer is added, sufficient cycles are run to achieve the desired amount of new template containing the non-complementary nucleotide insert. This allows production of large quantities of the combined fragments in a relatively short period of time (e.g., two hours or less) using a simple technique.

Oligonucleotide primers can be prepared using any suitable method, such as, for example, the phosphotriester and phosphodiester methods described above, or automated embodiments thereof. In one such automated embodiment, diethylphosphoramidites are used as starting materials and can be synthesized as described by Beaucage et al., 1981, Tetrahedron Letters 22:1859-1862. One method for synthesizing oligonucleotides on a modified solid support is described in U.S. Pat. No. 4,458,066. One can also use a primer that has been isolated from a biological source (such as a restriction endonuclease digest).

No matter what primers are used, however, the reaction mixture must contain a template for PCR to occur, because the specific nucleic acid sequence is produced by using a nucleic acid containing that sequence as a template. The first step involves contacting each nucleic acid strand with four different nucleoside triphosphates and two oligonucleotide primers for each specific nucleic acid sequence being amplified or detected. If the nucleic acids to be amplified or detected are DNA, then the nucleoside triphosphates are usually dATP, dCTP, dGTP, and dTTP, although various nucleotide derivatives can also be used in the process. For example, when using PCR for the detection of a known sequence in a sample of unknown sequences, dTTP is often replaced by dUTP in order to reduce contamination between samples as taught in PCT Application No. 91/05210 filed Jul. 23, 1991, which published on Feb. 6, 1992, incorporated herein by reference.

The concentration of nucleoside triphosphates can vary widely. Typically, the concentration is 50 to 200 μM in each dNTP of the buffer for amplification, and MgCl₂ is present in the buffer in an amount of 1 to 3 mM to activate the polymerase and increase the specificity of the reaction. However, dNTP concentrations of 1 to 20 μM may be preferred for some applications, such as DNA sequencing or generating radiolabeled probes at high specific activity.

The nucleic acid strands of the target nucleic acid serve as templates for the synthesis of additional nucleic acid strands, which are extension products of the primers. This synthesis can be performed using any suitable method, but generally occurs in a buffered aqueous solution, preferably at a pH of 7 to 9, most preferably about 8. To facilitate synthesis, a molar excess of the two oligonucleotide primers is added to the buffer containing the template strands. As a practical matter, the amount of primer added will generally be in molar excess over the amount of complementary strand (template) when the sequence to be amplified is contained in a mixture of complicated long-chain nucleic acid strands. A large molar excess is preferred to improve the efficiency of the process. Accordingly, primer:template ratios of at least 1000:1 or higher are generally employed for cloned DNA templates, and primer:template ratios of about 10⁸:1 or higher are generally employed for amplification from complex genomic samples.
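A primer:template ratio can be estimated from quantities that are ordinarily known at the bench. The Python sketch below is illustrative only: the function names are hypothetical, and the figure of 650 daltons per base pair is a commonly used approximation for double-stranded DNA that is assumed here rather than taken from this disclosure.

```python
# Illustrative estimate; 650 Da per base pair is an assumed approximation.

def template_moles(mass_ng: float, length_bp: float,
                   daltons_per_bp: float = 650.0) -> float:
    """Moles of double-stranded template from its mass and length."""
    return (mass_ng * 1e-9) / (length_bp * daltons_per_bp)


def primer_moles(concentration_um: float, volume_ul: float) -> float:
    """Moles of one primer from its concentration (uM) and volume (uL)."""
    return (concentration_um * 1e-6) * (volume_ul * 1e-6)


def primer_template_ratio(primer_um: float, volume_ul: float,
                          template_ng: float, template_bp: float) -> float:
    """Molar ratio of one primer to template molecules in the reaction."""
    return primer_moles(primer_um, volume_ul) / template_moles(
        template_ng, template_bp)
```

As a worked example under these assumptions, 1 μM primer in a 100 μL reaction is 10⁻¹⁰ mol, and 650 ng of a 1000 bp duplex is 10⁻¹² mol, for a ratio of 100:1. The same primer amount against a much longer template of equal mass gives a proportionally larger ratio, since fewer template molecules are present.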

The mixture of template, primers, and nucleoside triphosphates is then treated according to whether the nucleic acids being amplified or detected are double- or single-stranded. If the nucleic acids are single-stranded, then no denaturation step need be employed prior to the first extension cycle, and the reaction mixture is held at a temperature that promotes hybridization of the primer to its complementary target (template) sequence. Such temperature is generally from about 35° C. to 65° C. or more, preferably about 37° C. to 60° C. for an effective time, generally from a few seconds to five minutes, preferably from 30 seconds to one minute. A hybridization temperature of 35° C. to 70° C. may be used for 5' to 3' exonuclease mutant thermostable DNA polymerases. Primers that are 15 nucleotides or longer in length are used to increase the specificity of primer hybridization. Shorter primers require lower hybridization temperatures.

The complement to the original single-stranded nucleic acids can be synthesized by adding the thermostable DNA polymerase of the present invention in the presence of the appropriate buffer, dNTPs, and one or more oligonucleotide primers. If an appropriate single primer is added, the primer extension product will be complementary to the single-stranded nucleic acid and will be hybridized with the nucleic acid strand in a duplex of strands of equal or unequal length (depending on where the primer hybridizes to the template), which may then be separated into single strands as described above to produce two single, separated, complementary strands. A second primer would then be added so that subsequent cycles of primer extension would occur using both the original single-stranded nucleic acid and the extension product of the first primer as templates. Alternatively, two or more appropriate primers (one of which will prime synthesis using the extension product of the other primer as a template) can be added to the single-stranded nucleic acid and the reaction carried out.

If the nucleic acid contains two strands, as in the case of amplification of a double-stranded target or second-cycle amplification of a single-stranded target, the strands of nucleic acid must be separated before the primers are hybridized. This strand separation can be accomplished by any suitable denaturing method, including physical, chemical or enzymatic means. One preferred physical method of separating the strands of the nucleic acid involves heating the nucleic acid until complete (>99%) denaturation occurs. Typical heat denaturation involves temperatures ranging from about 80° C. to 105° C. for times generally ranging from about a few seconds to minutes, depending on the composition and size of the nucleic acid. Preferably, the effective denaturing temperature is 90° C.-100° C. for a few seconds to 1 minute. Strand separation may also be induced by an enzyme from the class of enzymes known as helicases or the enzyme RecA, which has helicase activity and in the presence of ATP is known to denature DNA. The reaction conditions suitable for separating the strands of nucleic acids with helicases are described by Kuhn Hoffmann-Berling, 1978, CSH-Quantitative Biology 43:63, and techniques for using RecA are reviewed in Radding, 1982, Ann. Rev. Genetics 16:405-437. The denaturation produces two separated complementary strands of equal or unequal length.

If the double-stranded nucleic acid is denatured by heat, the reaction mixture is allowed to cool to a temperature that promotes hybridization of each primer to the complementary target (template) sequence. This temperature is usually from about 35° C. to 65° C. or more, depending on reagents, preferably 37° C. to 60° C. The hybridization temperature is maintained for an effective time, generally a few seconds to minutes, and preferably 10 seconds to 1 minute. In practical terms, the temperature is simply lowered from about 95° C. to as low as 37° C., and hybridization occurs at a temperature within this range.

Whether the nucleic acid is single- or double-stranded, the thermostable DNA polymerase of the present invention can be added prior to or during the denaturation step or when the temperature is being reduced to or is in the range for promoting hybridization. Although the thermostability of the polymerases of the invention allows one to add such polymerases to the reaction mixture at any time, one can substantially inhibit non-specific amplification by adding the polymerase to the reaction mixture at a point in time when the mixture will not be cooled below the stringent hybridization temperature. After hybridization, the reaction mixture is then heated to or maintained at a temperature at which the activity of the enzyme is promoted or optimized, i.e., a temperature sufficient to increase the activity of the enzyme in facilitating synthesis of the primer extension products from the hybridized primer and template. The temperature must actually be sufficient to synthesize an extension product of each primer that is complementary to each nucleic acid template, but must not be so high as to denature each extension product from its complementary template (i.e., the temperature is generally less than about 80° C. to 90° C.).

Depending on the nucleic acid(s) employed, the typical temperature effective for this synthesis reaction generally ranges from about 40° C. to 80° C., preferably 50° C. to 75° C. The temperature more preferably ranges from about 65° C. to 75° C. for the thermostable DNA polymerases of the present invention. The period of time required for this synthesis may range from about 10 seconds to several minutes or more, depending mainly on the temperature, the length of the nucleic acid, the enzyme, and the complexity of the nucleic acid mixture. The extension time is usually about 30 seconds to a few minutes. If the nucleic acid is longer, a longer time period is generally required for complementary strand synthesis.

The newly synthesized strand and the complement nucleic acid strand form a double-stranded molecule that is used in the succeeding steps of the amplification process. In the next step, the strands of the double-stranded molecule are separated by heat denaturation at a temperature and for a time effective to denature the molecule, but not at a temperature and for a period so long that the thermostable enzyme is completely and irreversibly denatured or inactivated. After this denaturation of template, the temperature is decreased to a level that promotes hybridization of the primer to the complementary single-stranded molecule (template) produced from the previous step, as described above.

After this hybridization step, or concurrently with the hybridization step, the temperature is adjusted to a temperature that is effective to promote the activity of the thermostable enzyme to enable synthesis of a primer extension product using as a template both the newly synthesized and the original strands. The temperature again must not be so high as to separate (denature) the extension product from its template, as described above. Hybridization may occur during this step, so that the previous step of cooling after denaturation is not required. In such a case, using simultaneous steps, the preferred temperature range is 50° C. to 70° C.

The heating and cooling steps involved in one cycle of strand separation, hybridization, and extension product synthesis can be repeated as many times as needed to produce the desired quantity of the specific nucleic acid sequence. The only limitation is the amount of the primers, thermostable enzyme, and nucleoside triphosphates present. Usually, from 15 to 30 cycles are completed. For diagnostic detection of amplified DNA, the number of cycles will depend on the nature of the sample, the initial target concentration in the sample and the sensitivity of the detection process used after amplification. For a given sensitivity of detection, fewer cycles will be required if the sample being amplified is pure and the initial target concentration is high. If the sample is a complex mixture of nucleic acids and the initial target concentration is low, more cycles will be required to amplify the signal sufficiently for detection. For general amplification and detection, the process is repeated about 15 times. When amplification is used to generate sequences to be detected with labeled sequence-specific probes and when human genomic DNA is the target of amplification, the process is repeated 15 to 30 times to amplify the sequence sufficiently so that a clearly detectable signal is produced, i.e., so that background noise does not interfere with detection.
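
By way of illustration only, the relationship between initial target concentration and the number of cycles required can be sketched with a simple calculation. The sketch assumes idealized exponential amplification in which each cycle multiplies the target by (1 + efficiency). The function names, the efficiency parameter, and the example copy numbers are hypothetical and are not taken from this specification.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of amplification.

    efficiency=1.0 models perfect doubling per cycle (an assumption);
    real reactions are less efficient and eventually plateau.
    """
    if initial_copies <= 0 or cycles < 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be > 0, cycles >= 0")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles that reaches `target_copies`."""
    if initial_copies <= 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be > 0")
    cycles = 0
    copies = float(initial_copies)
    # Multiply cycle by cycle to avoid rounding issues with logarithms.
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Hypothetical detection threshold of 1e9 copies.
    for start in (10, 1000, 100000):
        print(start, "starting copies ->", cycles_needed(start, 1e9), "cycles")
```

Under these assumptions, a sample with a higher initial target concentration reaches a given detection threshold in fewer cycles, consistent with the preceding paragraph.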

No additional nucleotides, primers, or thermostable enzyme need be added after the initial addition, provided that no key reagent has been exhausted and that the enzyme has not become denatured or irreversibly inactivated, in which case additional polymerase or other reagent would have to be added for the reaction to continue. After the appropriate number of cycles has been completed to produce the desired amount of the specific nucleic acid sequence, the reaction can be halted in the usual manner, e.g., by inactivating the enzyme by adding EDTA, phenol, SDS, or CHCl₃ or by separating the components of the reaction.

The amplification process can be conducted continuously. In one embodiment of an automated process, the reaction mixture can be temperature cycled such that the temperature is programmed to be controlled at a certain level for a certain time. One such instrument for this purpose is the automated machine for handling the amplification reaction developed and marketed by Perkin-Elmer Cetus Instruments. Detailed instructions for carrying out PCR with the instrument are available upon purchase of the instrument.
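
A temperature program of the kind described above, in which the temperature is held at a certain level for a certain time, can be represented as a simple list of holds. The following is a minimal sketch and does not describe the interface of any actual instrument. The step names, the 95° C. denaturation temperature, and the hold times are assumptions chosen for illustration; only the 50° C. to 70° C. range for a combined hybridization and extension step and the 15 to 30 cycle range come from the text above.

```python
# Each hold is (name, temperature in degrees C, hold time in seconds).
# The 95 C denaturation step and both hold times are assumed values.
TWO_STEP_CYCLE = [
    ("denature", 95.0, 30),
    ("anneal/extend", 65.0, 60),  # within the 50-70 C combined-step range
]


def build_program(cycle_steps, cycles):
    """Expand one cycle into the full list of (cycle_number, hold) entries."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return [(n, hold) for n in range(1, cycles + 1) for hold in cycle_steps]


def total_hold_seconds(program):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return sum(hold[2] for _, hold in program)


if __name__ == "__main__":
    program = build_program(TWO_STEP_CYCLE, cycles=25)
    print(len(program), "holds,", total_hold_seconds(program), "seconds at temperature")
```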

The thermostable DNA polymerases of the present invention with altered 5' to 3' exonuclease activity are very useful in the diverse processes in which amplification of a nucleic acid sequence by PCR is useful. The amplification method may be utilized to clone a particular nucleic acid sequence for insertion into a suitable expression vector, as described in U.S. Pat. No. 4,800,159. The vector may be used to transform an appropriate host organism to produce the gene product of the sequence by standard methods of recombinant DNA technology. Such cloning may involve direct ligation into a vector using blunt-end ligation, or use of restriction enzymes to cleave at sites contained within the primers. Other processes suitable for the thermostable DNA polymerases of the present invention include those described in U.S. Pat. Nos. 4,683,195 and 4,683,202 and European Patent Publication Nos. 229,701; 237,362; and 258,017; these patents and publications are incorporated herein by reference. In addition, the present enzyme is useful in asymmetric PCR (see Gyllensten and Erlich, 1988, Proc. Natl. Acad. Sci. USA 85:7652-7656, incorporated herein by reference); inverse PCR (Ochman et al., 1988, Genetics 120:621, incorporated herein by reference); DNA sequencing (see Innis et al., 1988, Proc. Natl. Acad. Sci. USA 85:9436-9440, and McConlogue et al., 1988, Nuc. Acids Res. 16(20):9869); rapid amplification of cDNA ends (RACE); random priming PCR, which is used to amplify a series of DNA fragments; and PCR processes with single-sided specificity, such as anchor PCR and ligation-mediated anchor PCR, as described by Loh, E. in METHODS: A Companion to Methods in Enzymology (1991) 2: pp. 11-19.

An additional process in which a 5' to 3' exonuclease deficient thermostable DNA polymerase would be useful is a process referred to as polymerase ligase chain reaction (PLCR). As its name suggests, this process combines features of PCR with features of ligase chain reaction (LCR).

PLCR was developed in part as a technique to increase the specificity of allele-specific PCR in which the low concentrations of dNTPs utilized (~1 μM) limited the extent of amplification. In PLCR, DNA is denatured and four complementary, but not adjacent, oligonucleotide primers are added with dNTPs, a thermostable DNA polymerase and a thermostable ligase.

The primers anneal to target DNA in a non-adjacent fashion and the thermostable DNA polymerase causes the addition of appropriate dNTPs to the 3' end of the downstream primer to fill the gap between the non-adjacent primers and thus render the primers adjacent. The thermostable ligase will then ligate the two adjacent oligonucleotide primers.

However, the presence of 5' to 3' exonuclease activity in the thermostable DNA polymerase significantly decreases the probability of closing the gap between the two primers because such activity causes the excision of nucleotides or small oligonucleotides from the 5' end of the downstream primer, thus preventing ligation of the primers. Therefore, a thermostable DNA polymerase with attenuated or eliminated 5' to 3' exonuclease activity would be particularly useful in PLCR.

Briefly, the thermostable DNA polymerases of the present invention which have been mutated to have reduced, attenuated or eliminated 5' to 3' exonuclease activity are useful for the same procedures and techniques as their respective non-mutated polymerases except for procedures and techniques which require 5' to 3' exonuclease activity such as the homogeneous assay technique discussed below. Moreover, the mutated DNA polymerases of the present invention will oftentimes result in more efficient performance of the procedures and techniques due to the reduction or elimination of the inherent 5' to 3' exonuclease activity.

Specific thermostable DNA polymerases with attenuated 5' to 3' exonuclease activity include the following mutated forms of Taq, Tma, Tsps17, TZ05, Tth and Taf DNA polymerases. In the table below, and throughout the specification, deletion mutations are inclusive of the numbered nucleotides or amino acids which define the deletion.

    ______________________________________________________________________________________________
    DNA                                                                       Mutant
    Polymerase  Mutation                                                      Designation
    ______________________________________________________________________________________________
    Taq         G(137) to A in nucleotide SEQ ID NO: 1                        pRDA3-2
                Gly (46) to Asp in amino acid SEQ ID NO: 2                    ASP46 Taq
                Deletion of nucleotides 4-228 of nucleotide SEQ ID NO: 1      pTAQd2-76
                Deletion of amino acids 2-76 of amino acid SEQ ID NO: 2       MET-ALA 77 Taq
                Deletion of nucleotides 4-138 of nucleotide SEQ ID NO: 1      pTAQd2-46
                Deletion of amino acids 2-46 of amino acid SEQ ID NO: 2       MET-PHE 47 Taq
                Deletion of nucleotides 4-462 of nucleotide SEQ ID NO: 1      pTAQd2-155
                Deletion of amino acids 2-154 of amino acid SEQ ID NO: 2      MET-VAL 155 Taq
                Deletion of nucleotides 4-606 of nucleotide SEQ ID NO: 1      pTAQd2-202
                Deletion of amino acids 2-202 of amino acid SEQ ID NO: 2      MET-THR 203 Taq
                Deletion of nucleotides 4-867 of nucleotide SEQ ID NO: 1      pLSG8
                Deletion of amino acids 2-289 of amino acid SEQ ID NO: 2      MET-SER 290 Taq (Stoffel fragment)
    Tma         G(110) to A in nucleotide SEQ ID NO: 3
                Gly (37) to Asp in amino acid SEQ ID NO: 4                    ASP37 Tma
                Deletion of nucleotides 4-131 of nucleotide SEQ ID NO: 3      pTMAd2-37
                Deletion of amino acids 2-37 of amino acid SEQ ID NO: 4       MET-VAL 38 Tma
                Deletion of nucleotides 4-60 of nucleotide SEQ ID NO: 3       pTMAd2-20
                Deletion of amino acids 2-20 of amino acid SEQ ID NO: 4       MET-ASP 21 Tma
                Deletion of nucleotides 4-219 of nucleotide SEQ ID NO: 3      pTMAd2-73
                Deletion of amino acids 2-73 of amino acid SEQ ID NO: 4       MET-GLU 74 Tma
                Deletion of nucleotides 1-417 of nucleotide SEQ ID NO: 3      pTMA16
                Deletion of amino acids 1-139 of amino acid SEQ ID NO: 4      MET 140 Tma
                Deletion of nucleotides 1-849 of nucleotide SEQ ID NO: 3      pTMA15
                Deletion of amino acids 1-283 of amino acid SEQ ID NO: 4      MET 284 Tma
    Tsps17      G(128) to A in nucleotide SEQ ID NO: 5
                Gly (43) to Asp in amino acid SEQ ID NO: 6                    ASP43 Tsps17
                Deletion of nucleotides 4-129 of nucleotide SEQ ID NO: 5      pSPSd2-43
                Deletion of amino acids 2-43 of amino acid SEQ ID NO: 6       MET-PHE 44 Tsps17
                Deletion of nucleotides 4-219 of nucleotide SEQ ID NO: 5      pSPSd2-73
                Deletion of amino acids 2-73 of amino acid SEQ ID NO: 6       MET-ALA 74 Tsps17
                Deletion of nucleotides 4-453 of nucleotide SEQ ID NO: 5      pSPSd2-151
                Deletion of amino acids 2-151 of amino acid SEQ ID NO: 6      MET-LEU 152 Tsps17
                Deletion of nucleotides 4-597 of nucleotide SEQ ID NO: 5      pSPSd2-199
                Deletion of amino acids 2-199 of amino acid SEQ ID NO: 6      MET-THR 200 Tsps17
                Deletion of nucleotides 4-861 of nucleotide SEQ ID NO: 5      pSPSA288
                Deletion of amino acids 2-287 of amino acid SEQ ID NO: 6      MET-ALA 288 Tsps17
    TZ05        G(137) to A in nucleotide SEQ ID NO: 7
                Gly (46) to Asp in amino acid SEQ ID NO: 8                    ASP46 TZ05
                Deletion of nucleotides 4-138 of nucleotide SEQ ID NO: 7      pZ05d2-46
                Deletion of amino acids 2-46 of amino acid SEQ ID NO: 8       MET-PHE 47 TZ05
                Deletion of nucleotides 4-231 of nucleotide SEQ ID NO: 7      pZ05d2-77
                Deletion of amino acids 2-77 of amino acid SEQ ID NO: 8       MET-ALA 78 TZ05
                Deletion of nucleotides 4-475 of nucleotide SEQ ID NO: 7      pZ05d2-155
                Deletion of amino acids 2-155 of amino acid SEQ ID NO: 8      MET-VAL 156 TZ05
                Deletion of nucleotides 4-609 of nucleotide SEQ ID NO: 7      pZ05d2-203
                Deletion of amino acids 2-203 of amino acid SEQ ID NO: 8      MET-THR 204 TZ05
                Deletion of nucleotides 4-873 of nucleotide SEQ ID NO: 7      pZ05A292
                Deletion of amino acids 2-291 of amino acid SEQ ID NO: 8      MET-ALA 292 TZ05
    Tth         G(137) to A in nucleotide SEQ ID NO: 9
                Gly (46) to Asp in amino acid SEQ ID NO: 10                   ASP46 Tth
                Deletion of nucleotides 4-138 of nucleotide SEQ ID NO: 9      pTTHd2-46
                Deletion of amino acids 2-46 of amino acid SEQ ID NO: 10      MET-PHE 47 Tth
                Deletion of nucleotides 4-231 of nucleotide SEQ ID NO: 9      pTTHd2-77
                Deletion of amino acids 2-77 of amino acid SEQ ID NO: 10      MET-ALA 78 Tth
                Deletion of nucleotides 4-465 of nucleotide SEQ ID NO: 9      pTTHd2-155
                Deletion of amino acids 2-155 of amino acid SEQ ID NO: 10     MET-VAL 156 Tth
                Deletion of nucleotides 4-609 of nucleotide SEQ ID NO: 9      pTTHd2-203
                Deletion of amino acids 2-203 of amino acid SEQ ID NO: 10     MET-THR 204 Tth
                Deletion of nucleotides 4-873 of nucleotide SEQ ID NO: 9      pTTHA292
                Deletion of amino acids 2-291 of amino acid SEQ ID NO: 10     MET-ALA 292 Tth
    Taf         G(110) to A and A(111) to T in nucleotide SEQ ID NO: 11
                Gly (37) to Asp in amino acid SEQ ID NO: 12                   ASP37 Taf
                Deletion of nucleotides 4-111 of nucleotide SEQ ID NO: 11     pTAFd2-37
                Deletion of amino acids 2-37 of amino acid SEQ ID NO: 12      MET-LEU 38 Taf
                Deletion of nucleotides 4-279 of nucleotide SEQ ID NO: 11     pTAF09
                Deletion of amino acids 2-93 of amino acid SEQ ID NO: 12      MET-TYR 94 Taf
                Deletion of nucleotides 4-417 of nucleotide SEQ ID NO: 11     pTAF11
                Deletion of amino acids 2-139 of amino acid SEQ ID NO: 12     MET-GLU 140 Taf
                Deletion of nucleotides 4-609 of nucleotide SEQ ID NO: 11     pTAFd2-203
                Deletion of amino acids 2-203 of amino acid SEQ ID NO: 12     MET-THR 204 Taf
                Deletion of nucleotides 4-852 of nucleotide SEQ ID NO: 11     pTAFI285
                Deletion of amino acids 2-284 of amino acid SEQ ID NO: 12     MET-ILE 285 Taf
    ______________________________________________________________________________________________
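
The paired nucleotide and amino acid entries in the table above are related by simple codon arithmetic. The following minimal sketch assumes that nucleotide 1 is the first base of the initiator codon and that each amino acid is encoded by three consecutive nucleotides; the function names are hypothetical and are offered for illustration only.

```python
def nucleotide_range(first_aa, last_aa):
    """Inclusive 1-based nucleotide range encoding residues first_aa..last_aa."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("require 1 <= first_aa <= last_aa")
    return (3 * (first_aa - 1) + 1, 3 * last_aa)


def codon_number(nucleotide):
    """1-based codon (amino acid) number containing a 1-based nucleotide."""
    if nucleotide < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nucleotide - 1) // 3 + 1


def codon_position(nucleotide):
    """Position (1, 2 or 3) of a 1-based nucleotide within its codon."""
    if nucleotide < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nucleotide - 1) % 3 + 1


if __name__ == "__main__":
    # Deleting amino acids 2-76 retains the initiator Met (nucleotides 1-3).
    print(nucleotide_range(2, 76))                   # (4, 228)
    # Nucleotide 137 lies in codon 46, at the second codon position.
    print(codon_number(137), codon_position(137))    # 46 2
```

Under these assumptions, a deletion of amino acids 2 through N corresponds to a deletion of nucleotides 4 through 3N, and a deletion of amino acids 1 through N corresponds to a deletion of nucleotides 1 through 3N.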

Thermostable DNA Polymerases with Enhanced 5' to 3' Exonuclease Activity

Another aspect of the present invention involves the generation of thermostable DNA polymerases which exhibit enhanced or increased 5' to 3' exonuclease activity over that of their respective native polymerases. The thermostable DNA polymerases of the present invention which have increased or enhanced 5' to 3' exonuclease activity are particularly useful in the homogeneous assay system described in PCT application No. 91/05571 filed Aug. 6, 1991, which published on Feb. 20, 1992, which is incorporated herein by reference. Briefly, this system is a process for the detection of a target nucleic acid sequence in a sample comprising:

(a) contacting a sample comprising single-stranded nucleic acids with an oligonucleotide containing a sequence complementary to a region of the target nucleic acid and a labeled oligonucleotide containing a sequence complementary to a second region of the same target nucleic acid strand, but not including the nucleic acid sequence defined by the first oligonucleotide, to create a mixture of duplexes during hybridization conditions, wherein the duplexes comprise the target nucleic acid annealed to the first oligonucleotide and to the labeled oligonucleotide such that the 3' end of the first oligonucleotide is adjacent to the 5' end of the labeled oligonucleotide;

(b) maintaining the mixture of step (a) with a template-dependent nucleic acid polymerase having a 5' to 3' nuclease activity under conditions sufficient to permit the 5' to 3' nuclease activity of the polymerase to cleave the annealed, labeled oligonucleotide and release labeled fragments; and

(c) detecting and/or measuring the release of labeled fragments.

This homogeneous assay system is one which generates signal while the target sequence is amplified, thus minimizing the post-amplification handling of the amplified product which is common to other assay systems. Furthermore, a particularly preferred use of the thermostable DNA polymerases with increased 5' to 3' exonuclease activity is in a homogeneous assay system which utilizes PCR technology. This particular assay system involves:

(a) providing to a PCR assay containing said sample, at least one labeled oligonucleotide containing a sequence complementary to a region of the target nucleic acid, wherein said labeled oligonucleotide anneals within the target nucleic acid sequence bounded by the oligonucleotide primers of step (b);

(b) providing a set of oligonucleotide primers, wherein a first primer contains a sequence complementary to a region in one strand of the target nucleic acid sequence and primes the synthesis of a complementary DNA strand, and a second primer contains a sequence complementary to a region in a second strand of the target nucleic acid sequence and primes the synthesis of a complementary DNA strand; and wherein each oligonucleotide primer is selected to anneal to its complementary template upstream of any labeled oligonucleotide annealed to the same nucleic acid strand;

(c) amplifying the target nucleic acid sequence employing a nucleic acid polymerase having 5' to 3' nuclease activity as a template-dependent polymerizing agent under conditions which are permissive for PCR cycling steps of (i) annealing of primers and labeled oligonucleotide to a template nucleic acid sequence contained within the target region, and (ii) extending the primer, wherein said nucleic acid polymerase synthesizes a primer extension product while the 5' to 3' nuclease activity of the nucleic acid polymerase simultaneously releases labeled fragments from the annealed duplexes comprising labeled oligonucleotide and its complementary template nucleic acid sequences, thereby creating detectable labeled fragments; and

(d) detecting and/or measuring the release of labeled fragments to determine the presence or absence of target sequence in the sample.

The increased 5' to 3' exonuclease activity of the thermostable DNA polymerases of the present invention when used in the homogeneous assay systems causes the cleavage of mononucleotides or small oligonucleotides from an oligonucleotide annealed to its larger, complementary polynucleotide. In order for cleavage to occur efficiently, an upstream oligonucleotide must also be annealed to the same larger polynucleotide.

The 3' end of this upstream oligonucleotide provides the initial binding site for the nucleic acid polymerase. As soon as the bound polymerase encounters the 5' end of the downstream oligonucleotide, the polymerase can cleave mononucleotides or small oligonucleotides therefrom.

The two oligonucleotides can be designed such that they anneal in close proximity on the complementary target nucleic acid such that binding of the nucleic acid polymerase to the 3' end of the upstream oligonucleotide automatically puts it in contact with the 5' end of the downstream oligonucleotide. This process, because polymerization is not required to bring the nucleic acid polymerase into position to accomplish the cleavage, is called "polymerization-independent cleavage".

Alternatively, if the two oligonucleotides anneal to more distantly spaced regions of the template nucleic acid target, polymerization must occur before the nucleic acid polymerase encounters the 5' end of the downstream oligonucleotide. As the polymerization continues, the polymerase progressively cleaves mononucleotides or small oligonucleotides from the 5' end of the downstream oligonucleotide. This cleaving continues until the remainder of the downstream oligonucleotide has been destabilized to the extent that it dissociates from the template molecule. This process is called "polymerization-dependent cleavage".

The attachment of label to the downstream oligonucleotide permits the detection of the cleaved mononucleotides and small oligonucleotides. Subsequently, any of several strategies may be employed to distinguish the uncleaved labeled oligonucleotide from the cleaved fragments thereof. In this manner, nucleic acid samples which contain sequences complementary to the upstream and downstream oligonucleotides can be identified. Stated differently, a labeled oligonucleotide is added concomitantly with the primer at the start of PCR, and the signal generated from hydrolysis of the labeled nucleotide(s) of the probe provides a means for detection of the target sequence during its amplification.

In the homogeneous assay system process, a sample is provided which is suspected of containing the particular oligonucleotide sequence of interest, the "target nucleic acid". The target nucleic acid contained in the sample may be first reverse transcribed into cDNA, if necessary, and then denatured, using any suitable denaturing method, including physical, chemical, or enzymatic means, which are known to those of skill in the art. A preferred physical means for strand separation involves heating the nucleic acid until it is completely (>99%) denatured. Typical heat denaturation involves temperatures ranging from about 80° C. to about 105° C., for times ranging from a few seconds to minutes. As an alternative to denaturation, the target nucleic acid may exist in a single-stranded form in the sample, such as, for example, single-stranded RNA or DNA viruses.

The denatured nucleic acid strands are then incubated with preselected oligonucleotide primers and labeled oligonucleotide (also referred to herein as "probe") under hybridization conditions, conditions which enable the binding of the primers and probes to the single nucleic acid strands. As known in the art, the primers are selected so that their relative positions along a duplex sequence are such that an extension product synthesized from one primer, when the extension product is separated from its template (complement), serves as a template for the extension of the other primer to yield a replicate chain of defined length.

Because the complementary strands are longer than either the probe or primer, the strands have more points of contact and thus a greater chance of finding each other over any given period of time. A high molar excess of probe, plus the primer, helps tip the balance toward primer and probe annealing rather than template reannealing.

The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact length and composition of the primer will depend on many factors, including temperature of the annealing reaction, source and composition of the primer, proximity of the probe annealing site to the primer annealing site, and ratio of primer:probe concentration. For example, depending on the complexity of the target sequence, the oligonucleotide primer typically contains about 15-30 nucleotides, although a primer may contain more or fewer nucleotides. The primers must be sufficiently complementary to anneal to their respective strands selectively and form stable duplexes.

The primers used herein are selected to be "substantially" complementary to the different strands of each specific sequence to be amplified. The primers need not reflect the exact sequence of the template, but must be sufficiently complementary to hybridize selectively to their respective strands. Non-complementary bases or longer sequences can be interspersed into the primer or located at the ends of the primer, provided the primer retains sufficient complementarity with a template strand to form a stable duplex therewith. The non-complementary nucleotide sequences of the primers may include restriction enzyme sites.

In the practice of the homogeneous assay system, the labeled oligonucleotide probe must be first annealed to a complementary nucleic acid before the nucleic acid polymerase encounters this duplex region, thereby permitting the 5' to 3' exonuclease activity to cleave and release labeled oligonucleotide fragments.

To enhance the likelihood that the labeled oligonucleotide will have annealed to a complementary nucleic acid before primer extension polymerization reaches this duplex region, or before the polymerase attaches to the upstream oligonucleotide in the polymerization-independent process, a variety of techniques may be employed. For the polymerization-dependent process, one can position the probe so that the 5'-end of the probe is relatively far from the 3'-end of the primer, thereby giving the probe more time to anneal before primer extension blocks the probe binding site. Short primer molecules generally require lower temperatures to form sufficiently stable hybrid complexes with the target nucleic acid. Therefore, the labeled oligonucleotide can be designed to be longer than the primer so that the labeled oligonucleotide anneals preferentially to the target at higher temperatures relative to primer annealing.

One can also use primers and labeled oligonucleotides having differential thermal stability. For example, the nucleotide composition of the labeled oligonucleotide can be chosen to have greater G/C content and, consequently, greater thermal stability than the primer. In similar fashion, one can incorporate modified nucleotides into the probe, which modified nucleotides contain base analogs that form more stable base pairs than the bases that are typically present in naturally occurring nucleic acids.

Modifications of the probe that may facilitate probe binding prior to primer binding to maximize the efficiency of the present assay include the incorporation of positively charged or neutral phosphodiester linkages in the probe to decrease the repulsion of the polyanionic backbones of the probe and target (see Letsinger et al., 1988, J. Amer. Chem. Soc. 110:4470); the incorporation of alkylated or halogenated bases, such as 5-bromouridine, in the probe to increase base stacking; the incorporation of ribonucleotides into the probe to force the probe:target duplex into an "A" structure, which has increased base stacking; and the substitution of 2,6-diaminopurine (amino adenosine) for some or all of the adenosines in the probe. In preparing such modified probes of the invention, one should recognize that the rate limiting step of duplex formation is "nucleation", the formation of a single base pair, and therefore, altering the biophysical characteristic of a portion of the probe, for instance, only the 3' or 5' terminal portion, can suffice to achieve the desired result. In addition, because the 3' terminal portion of the probe (the 3' terminal 8 to 12 nucleotides) dissociates following exonuclease degradation of the 5' terminus by the polymerase, modifications of the 3' terminus can be made without concern about interference with polymerase/nuclease activity.

The thermocycling parameters can also be varied to take advantage of the differential thermal stability of the labeled oligonucleotide and primer. For example, following the denaturation step in thermocycling, an intermediate temperature may be introduced which is permissive for labeled oligonucleotide binding but not primer binding, and then the temperature is further reduced to permit primer annealing and extension. One should note, however, that probe cleavage need only occur in later cycles of the PCR process for suitable results. Thus, one could set up the reaction mixture so that even though primers initially bind in preference to probes, primer concentration is reduced through primer extension so that, in later cycles, probes bind in preference to primers.

To favor binding of the labeled oligonucleotide before the primer, a high molar excess of labeled oligonucleotide to primer concentration can also be used. In this embodiment, labeled oligonucleotide concentrations are typically in the range of about 2 to 20 times higher than the respective primer concentration, which is generally 0.5-5×10⁻⁷ M. Those of skill recognize that oligonucleotide concentration, length, and base composition are each important factors that affect the T_(m) of any particular oligonucleotide in a reaction mixture. Each of these factors can be manipulated to create a thermodynamic bias to favor probe annealing over primer annealing.
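The probe concentrations implied by these two ranges can be worked out directly. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and are not part of the specification.

```python
PRIMER_RANGE_M = (0.5e-7, 5e-7)   # primer concentration range from the text
EXCESS_RANGE = (2, 20)            # probe:primer molar excess from the text

def probe_range(primer_range, excess_range):
    """Lowest and highest probe concentration implied by the two ranges."""
    low = primer_range[0] * excess_range[0]
    high = primer_range[1] * excess_range[1]
    return low, high

low, high = probe_range(PRIMER_RANGE_M, EXCESS_RANGE)
print(f"probe: {low:.0e} M to {high:.0e} M")
```

Under these figures the labeled oligonucleotide would fall between roughly 10⁻⁷ M and 10⁻⁵ M.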

Of course, the homogeneous assay system can be applied to systems that do not involve amplification. In fact, the present invention does not even require that polymerization occur. One advantage of the polymerization-independent process lies in the elimination of the need for amplification of the target sequence. In the absence of primer extension, the target nucleic acid is substantially single-stranded. Provided the primer and labeled oligonucleotide are adjacently bound to the target nucleic acid, sequential rounds of oligonucleotide annealing and cleavage of labeled fragments can occur. Thus, a sufficient amount of labeled fragments can be generated, making detection possible in the absence of polymerization. As would be appreciated by those skilled in the art, the signal generated during PCR amplification could be augmented by this polymerization-independent activity.

In addition to the homogeneous assay systems described above, the thermostable DNA polymerases of the present invention with enhanced 5' to 3' exonuclease activity are also useful in other amplification systems, such as the transcription amplification system, in which one of the PCR primers encodes a promoter that is used to make RNA copies of the target sequence. In similar fashion, the present invention can be used in a self-sustained sequence replication (3SR) system, in which a variety of enzymes are used to make RNA transcripts that are then used to make DNA copies, all at a single temperature. By incorporating a polymerase with 5' to 3' exonuclease activity into a ligase chain reaction (LCR) system, together with appropriate oligonucleotides, one can also employ the present invention to detect LCR products.

Also, just as 5' to 3' exonuclease deficient thermostable DNA polymerases are useful in PLCR, other thermostable DNA polymerases which have 5' to 3' exonuclease activity are also useful in PLCR under different circumstances. Such is the case when the 5' tail of the downstream primer in PLCR is non-complementary to the target DNA. Such non-complementarity causes a forked structure where the 5' end of the downstream primer would normally anneal to the target DNA.

Thermostable ligases cannot act on such forked structures. However, the presence of 5' to 3' exonuclease activity in the thermostable DNA polymerase will cause the excision of the forked 5' tail of the downstream primer, thus permitting the ligase to act.

The same processes and techniques which are described above as effective for preparing thermostable DNA polymerases with attenuated 5' to 3' exonuclease activity are also effective for preparing the thermostable DNA polymerases with enhanced 5' to 3' exonuclease activity. As described above, these processes include such techniques as site-directed mutagenesis, deletion mutagenesis and "domain shuffling".

Of particular usefulness in preparing the thermostable DNA polymerases with enhanced 5' to 3' exonuclease activity is the "domain shuffling" technique described above. To briefly summarize, this technique involves the cleavage of a specific domain of a polymerase which is recognized as coding for a very active 5' to 3' exonuclease activity of that polymerase, and then transferring that domain into the appropriate area of a second thermostable DNA polymerase gene which encodes a lower level or no 5' to 3' exonuclease activity. The desired domain may replace a domain which encodes an undesired property of the second thermostable DNA polymerase or be added to the nucleotide sequence of the second thermostable DNA polymerase.

A particular "domain shuffling" example is set forth above in which the Tma DNA polymerase coding sequence comprising codons about 291 through 484 is substituted for the Taq DNA polymerase I codons 289 through 422. This substitution yields a novel thermostable DNA polymerase containing the 5' to 3' exonuclease domain of Taq DNA polymerase (codons 1-289), the 3' to 5' exonuclease domain of Tma DNA polymerase (codons 291-484) and the DNA polymerase domain of Taq DNA polymerase (codons 423-832). However, those skilled in the art will recognize that other substitutions can be made in order to construct a thermostable DNA polymerase with certain desired characteristics such as enhanced 5' to 3' exonuclease activity.

The following examples are offered by way of illustration only and are by no means intended to limit the scope of the claimed invention. In these examples, all percentages are by weight if for solids and by volume if for liquids, unless otherwise noted, and all temperatures are given in degrees Celsius.

EXAMPLE 1 Preparation of a 5' to 3' Exonuclease Mutant of Taq DNA Polymerase by Random Mutagenesis PCR of the Known 5' to 3' Exonuclease Domain

Preparation of Insert

Plasmid pLSG12 was used as a template for PCR. This plasmid is a HindIII minus version of pLSG5 in which the Taq polymerase gene nucleotides 616-621 of SEQ ID NO:1 were changed from AAGCTT to AAGCTG. This change eliminated the HindIII recognition sequence within the Taq polymerase gene without altering encoded protein sequence.
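That the AAGCTT to AAGCTG change is silent can be checked with a short sketch. It assumes that nucleotide 1 of SEQ ID NO:1 is the A of the ATG start codon, so that nucleotides 616-621 span two complete codons; the helper names and the partial codon table are illustrative only.

```python
# Partial standard genetic code: only the codons needed for this check.
CODON_TABLE = {"AAG": "Lys", "CTT": "Leu", "CTG": "Leu"}

def translate(seq):
    """Translate an in-frame DNA string using the partial table above."""
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]

def codon_number(nt_pos):
    """1-based codon index for a 1-based nucleotide position (ATG assumed at nt 1)."""
    return (nt_pos - 1) // 3 + 1

original = "AAGCTT"   # HindIII site, nucleotides 616-621
mutated = "AAGCTG"    # sequence in pLSG12

print("codons spanned:", codon_number(616), codon_number(621))
print("original:", translate(original))
print("mutated: ", translate(mutated))
print("HindIII site still present:", "AAGCTT" in mutated)
```

Under the stated numbering assumption the six nucleotides cover codons 206 and 207, and both versions encode Lys-Leu, since CTT and CTG are synonymous leucine codons.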

Using oligonucleotides MK61 (AGGACTACAACTGCCACACACC) (SEQ ID NO:21) and RA01 (CGAGGCGCGCCAGCCCCAGGAGATCTACCAGCTCCTTG) (SEQ ID NO:22) as primers and pLSG12 as the template, PCR was conducted to amplify a 384 bp fragment containing the ATG start of the Taq polymerase gene, as well as an additional 331 bp of coding sequence downstream of the ATG start codon.

A 100 μl PCR was conducted for 25 cycles utilizing the following amounts of the following agents and reactants:

50 pmol of primer MK61 (SEQ ID NO:21);

50 pmol of primer RA01 (SEQ ID NO:22);

50 μM of each dNTP;

10 mM Tris-HCl, pH 8.3;

50 mM KCl;

1.5 mM MgCl₂;

75.6 pg pLSG12;

2.5 units AmpliTaq DNA polymerase.
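The primer amounts in this list are given in pmol while the other components are given as concentrations. A minimal sketch of the conversion for the 100 μl reaction follows; the function name is illustrative.

```python
def molar_concentration(pmol, volume_ul):
    """Concentration in mol/L from an amount in pmol and a volume in microliters."""
    return (pmol * 1e-12) / (volume_ul * 1e-6)

# 50 pmol of each primer in the 100 microliter reaction described above
primer_molar = molar_concentration(50, 100)
print(f"each primer: {primer_molar:.1e} M ({primer_molar * 1e6:.2f} micromolar)")
```

Each primer is therefore present at 0.5 μM.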

The PCR reaction mixture described was placed in a Perkin-Elmer Cetus thermal cycler and run through the following profile. The reaction mixture was first ramped up to 98° C. over 1 minute and 45 seconds, and held at 98° C. for 25 seconds. The reaction mixture was then ramped down to 55° C. over 45 seconds and held at that temperature for 20 seconds. Finally, the mixture was ramped up to 72° C. over 45 seconds, and held at 72° C. for 30 seconds. A final 5 minute extension occurred at 72° C.
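The run time implied by this profile can be estimated with a short sketch. It assumes that every one of the 25 cycles repeats all six segments, including the 1 minute 45 second ramp; the text does not say whether that first ramp is shorter in later cycles, so the total is an upper-end estimate.

```python
# (segment, seconds) for one cycle, taken from the profile in the text
CYCLE_STEPS = [
    ("ramp to 98 C", 105),
    ("hold at 98 C", 25),
    ("ramp to 55 C", 45),
    ("hold at 55 C", 20),
    ("ramp to 72 C", 45),
    ("hold at 72 C", 30),
]
CYCLES = 25                  # number of cycles stated for this PCR
FINAL_EXTENSION_S = 5 * 60   # final 5 minute extension

cycle_seconds = sum(seconds for _, seconds in CYCLE_STEPS)
total_seconds = CYCLES * cycle_seconds + FINAL_EXTENSION_S

print(cycle_seconds, "s per cycle")
print(round(total_seconds / 60, 1), "min total")
```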

The PCR product was then extracted with chloroform and precipitated with isopropanol using techniques which are well known in the art.

A 300 ng sample of the PCR product was digested with 20 U of HindIII (in a 30 μl reaction) for 2 hours at 37° C. Then, an additional digestion was made with 8 U of BssHII for 2 hours at 50° C. This series of digestions yielded a 330 bp fragment for cloning.

A vector was prepared by digesting 5.3 μg of pLSG12 with 20 U HindIII (in 40 μl) for 2 hours at 37° C. This digestion was followed by addition of 12 U of BssHII and incubation for 2 hours at 50° C.

The vector was dephosphorylated by treatment with CIAP (calf intestinal alkaline phosphatase), specifically 0.04 U CIAP for 30 minutes at 30° C. Then, 4 μl of 500 mM EGTA was added to the vector preparation to stop the reaction, and the phosphatase was inactivated by incubation at 65° C. for 45 minutes.

225 ng of the phosphatased vector described above was ligated at a 1:1 molar ratio with 10 ng of the PCR-derived insert.
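The masses quoted for a 1:1 molar ratio can be related to fragment length with a short sketch. The vector length is not given here, so the sketch back-calculates the length at which 225 ng of vector would match 10 ng of the 330 bp insert; the 650 g/mol average mass per base pair is a common approximation and, like the function names, is an assumption of this example.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of double-stranded DNA (approximation)

def pmol_dsdna(mass_ng, length_bp):
    """Picomoles of a double-stranded fragment from its mass and length."""
    return mass_ng * 1e-9 / (length_bp * AVG_BP_MASS) * 1e12

INSERT_NG, INSERT_BP = 10, 330   # PCR-derived HindIII/BssHII fragment
VECTOR_NG = 225                  # phosphatased vector

insert_pmol = pmol_dsdna(INSERT_NG, INSERT_BP)
# Vector length at which 225 ng equals the insert on a molar basis
implied_vector_bp = VECTOR_NG / INSERT_NG * INSERT_BP
vector_pmol = pmol_dsdna(VECTOR_NG, implied_vector_bp)

print(f"insert: {insert_pmol:.3f} pmol")
print(f"implied vector length: {implied_vector_bp:.0f} bp")
```

The ratio of masses alone fixes the implied vector length, so that figure does not depend on the per-base-pair mass chosen.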

Then, DG116 cells were transformed with one fifth of the ligation mixture, and ampicillin-resistant transformants were selected at 30° C.

Appropriate colonies were grown overnight at 30° C. to OD₆₀₀ 0.7. Cells containing the P_(L) vectors were induced at 37° C. in a shaking water bath for 4, 9, or 20 hours, and the preparations were sonicated and heat treated at 75° C. in the presence of 0.2M ammonium sulfate. Finally, the extracts were assayed for polymerase activity and 5' to 3' exonuclease activity.

The 5' to 3' exonuclease activity was quantified utilizing the 5' to 3' exonuclease assay described above. Specifically, the synthetic 3' phosphorylated oligonucleotide probe (phosphorylated to preclude polymerase extension) BW33 (GATCGCTGCGCGTAACCACCACACCCGCCGCGCp) (SEQ ID NO:13) (100 pmol) was ³²P-labeled at the 5' end with gamma-[³²P] ATP (3000 Ci/mmol) and T4 polynucleotide kinase. The reaction mixture was extracted with phenol:chloroform:isoamyl alcohol, followed by ethanol precipitation. The ³²P-labeled oligonucleotide probe was redissolved in 100 μl of TE buffer, and unincorporated ATP was removed by gel filtration chromatography on a Sephadex G-50 spin column. Five pmol of ³²P-labeled BW33 probe was annealed to 5 pmol of single-strand M13mp10w DNA, in the presence of 5 pmol of the synthetic oligonucleotide primer BW37 (GCGCTAGGGCGCTGGCAAGTGTAGCGGTCA) (SEQ ID NO:14) in a 100 μl reaction containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 3 mM MgCl₂. The annealing mixture was heated to 95° C. for 5 minutes, cooled to 70° C. over 10 minutes, incubated at 70° C. for an additional 10 minutes, and then cooled to 25° C. over a 30 minute period in a Perkin-Elmer Cetus DNA thermal cycler. Exonuclease reactions containing 10 μl of the annealing mixture were pre-incubated at 70° C. for 1 minute. The thermostable DNA polymerase preparations of the invention (approximately 0.3 U of enzyme activity) were added in a 2.5 μl volume to the pre-incubation reaction, and the reaction mixture was incubated at 70° C. Aliquots (5 μl) were removed after 1 minute and 5 minutes, and stopped by the addition of 1 μl of 60 mM EDTA. The reaction products were analyzed by homochromatography and exonuclease activity was quantified following autoradiography. Chromatography was carried out in a homochromatography mix containing 2% partially hydrolyzed yeast RNA in 7M urea on Polygram CEL 300 DEAE cellulose thin layer chromatography plates. The presence of 5' to 3' exonuclease activity resulted in the generation of small ³²P-labeled oligomers, which migrated up the TLC plate, and were easily differentiated on the autoradiogram from undegraded probe, which remained at the origin.

The clone 3-2 had an expected level of polymerase activity but barely detectable 5' to 3' exonuclease activity. This represented a greater than 1000-fold reduction in 5' to 3' exonuclease activity from that present in native Taq DNA polymerase.

This clone was then sequenced and it was found that G (137) was mutated to an A in the DNA sequence. This mutation results in a Gly (46) to Asp mutation in the amino acid sequence of the Taq DNA polymerase, thus yielding a thermostable DNA polymerase of the present invention with significantly attenuated 5' to 3' exonuclease activity.
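The relationship between nucleotide 137 and residue 46 can be illustrated with a short sketch. It assumes that nucleotide 1 is the A of the ATG start codon; the actual third base of codon 46 is not stated here, so the sketch simply enumerates all four glycine codons. Names are illustrative.

```python
def codon_and_offset(nt_pos):
    """Map a 1-based coding nucleotide to (codon number, position within codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1

# Standard genetic code for the two codon families involved
GG_FAMILY = {"GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly"}
GA_FAMILY = {"GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu"}

codon, offset = codon_and_offset(137)
print(f"nucleotide 137 is position {offset} of codon {codon}")

# Effect of a G-to-A change at the second position of each glycine codon
outcomes = {c: GA_FAMILY[c[0] + "A" + c[2]] for c in GG_FAMILY}
for before, residue in outcomes.items():
    print(before, "->", before[0] + "A" + before[2], residue)
```

A second-position G to A change gives Asp only when the third base is T or C, which is consistent with the reported Gly to Asp substitution.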

The recovered protein was purified according to the Taq DNA polymerase protocol which is taught in Ser. No. 523,394 filed May 15, 1990, which issued as U.S. Pat. No. 5,079,352, incorporated herein by reference.

EXAMPLE 2 Construction of Met 289 (Δ289) 544 Amino Acid Form of Taq Polymerase

As indicated in Example 9 of U.S. Ser. No. 523,394, filed May 15, 1990, during a purification of native Taq polymerase an altered form of Taq polymerase was obtained that catalyzed the template dependent incorporation of dNTP at 70° C. This altered form of Taq polymerase was immunologically related to the approximate 90 kDa form of purified native Taq polymerase but was of lower molecular weight. Based on mobility, relative to BSA and ovalbumin following SDS-PAGE electrophoresis, the apparent molecular weight of this form is approximately 61 kDa. This altered form of the enzyme is not present in carefully prepared crude extracts of Thermus aquaticus cells as determined by SDS-PAGE Western blot analysis or in situ DNA polymerase activity determination (Spanos, A., and Hubscher, U. (1983) Meth. Enz. 91:263-277) following SDS-PAGE gel electrophoresis. This form appears to be a proteolytic artifact that may arise during sample handling. This lower molecular weight form was purified to homogeneity and subjected to N-terminal sequence determination on an ABI automated gas phase sequencer. Comparison of the obtained N-terminal sequence with the predicted amino acid sequence of the Taq polymerase gene (SEQ ID NO:1) indicates this shorter form arose as a result of proteolytic cleavage between Glu(289) and Ser(290).

To obtain a further truncated form of a Taq polymerase gene that would direct the synthesis of a 544 amino acid primary translation product, plasmids pFC54.t, pSYC1578 and the complementary synthetic oligonucleotides DG29 (5'-AGCTTATGTCTCCAAAAGCT) (SEQ ID NO:23) and DG30 (5'-AGCTTTTGGAGACATA) (SEQ ID NO:24) were used. Plasmid pFC54.t was digested to completion with HindIII and BamHI. Plasmid pSYC1578 was digested with BstXI (at nucleotides 872 to 883 of SEQ ID NO:1) and treated with E. coli DNA polymerase I Klenow fragment in the presence of all 4 dNTPs to remove the 4 nucleotide 3' cohesive end and generate a CTG-terminated duplex blunt end encoding Leu294 in the Taq polymerase sequence (see Taq polymerase SEQ ID NO:1 nucleotides 880-882). The DNA sample was digested to completion with BglII and the approximate 1.6 kb BstXI (repaired)/BglII Taq DNA fragment was purified by agarose gel electrophoresis and electroelution. The pFC54.t plasmid digest (0.1 pmole) was ligated with the Taq polymerase gene fragment (0.3 pmole) and annealed nonphosphorylated DG29/DG30 duplex adaptor (0.5 pmole) under sticky ligase conditions at 30 μg/ml, 15° C. overnight. The DNA was diluted to approximately 10 μg/ml and ligation continued under blunt end conditions. The ligated DNA sample was digested with XbaI to linearize (inactivate) any IL-2 mutein-encoding ligation products. Eighty nanograms of the ligated and digested DNA was used to transform E. coli K12 strain DG116 to ampicillin resistance. Amp^(R) candidates were screened for the presence of an approximate 7.17 kb plasmid which yielded the expected digestion products with EcoRI (4,781 bp+2,386 bp), PstI (4,138 bp+3,029 bp), ApaI (7,167 bp) and HindIII/PstI (3,400 bp+3,029 bp+738 bp). E. coli colonies harboring candidate plasmids were screened by single colony immunoblot for the temperature-inducible synthesis of an approximate 61 kDa Taq polymerase related polypeptide. In addition, candidate plasmids were subjected to DNA sequence determination at the 5' λP_(L) promoter:Taq DNA junction and the 3' Taq DNA:BT cry PRE junction. One of the plasmids encoding the intended DNA sequence and directing the synthesis of a temperature-inducible 61 kDa Taq polymerase related polypeptide was designated pLSG8.
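The expected digestion products quoted for the candidate plasmid can be cross-checked against one another, since the fragments from each digest of a circular plasmid should sum to the plasmid length. The following is a minimal sketch of that check using only the fragment sizes given in the text; the variable names are illustrative.

```python
EXPECTED_PLASMID_BP = 7167  # size of the single ApaI fragment

# Fragment sizes (bp) reported for each diagnostic digest
digests = {
    "EcoRI": [4781, 2386],
    "PstI": [4138, 3029],
    "ApaI": [7167],
    "HindIII/PstI": [3400, 3029, 738],
}

totals = {enzyme: sum(fragments) for enzyme, fragments in digests.items()}
for enzyme, total in totals.items():
    status = "consistent" if total == EXPECTED_PLASMID_BP else "mismatch"
    print(f"{enzyme}: {total} bp ({status})")
```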

Expression of 61 kDa Taq Pol

Cultures containing pLSG8 were grown as taught in Ser. No. 523,394 and described in Example 3 below. The 61 kDa Taq Pol appears not to be degraded upon heat-induction at 41° C. After 21 hours at 41° C., a heat-treated crude extract from a culture harboring pLSG8 had 12,310 units of heat-stable DNA polymerase activity per mg crude extract protein, a 24-fold increase over an uninduced culture. A heat-treated extract from a 21 hour 37° C.-induced pLSG8 culture had 9,503 units of activity per mg crude extract protein. A nine-fold increase in accumulated levels of Taq Pol I was observed between a 5 hour and 21 hour induction at 37° C. and a nearly four-fold increase between a 5 hour and 21 hour induction at 41° C. The same total protein and heat-treated extracts were analyzed by SDS-PAGE. 20 μg crude extract protein or heat-treated crude extract from 20 μg crude extract protein were applied to each lane of the gel. The major bands readily apparent in both the 37° C. and 41° C., 21 hour-induced total protein lanes are as intense as their heat-treated counterparts. Heat-treated crude extracts from 20 μg of total protein from 37° C. and 41° C., 21 hour samples contain 186 units and 243 units of thermostable DNA polymerase activity, respectively. To determine the usefulness of 61 kDa Taq DNA polymerase in PCR, PCR assays were performed using heat-treated crude extracts from induced cultures of pLSG8. Heat-treated crude extracts from induced cultures of pLSG5 were used as the source of full-length Taq Pol in PCR. PCR product was observed in reactions utilizing 4 units and 2 units of truncated enzyme. There was more product in those PCRs than in any of the full-length enzyme reactions. In addition, no non-specific higher molecular weight products were visible.

Purification of 61 kDa Taq Pol

Purification of 61 kDa Taq Pol from induced pLSG8/DG116 cells proceeded as for the purification of full-length Taq Pol in Example 12 of U.S. Ser. No. 523,394, filed May 15, 1990, which issued as U.S. Pat. No. 5,079,352, with some modifications.

Induced pLSG8/DG116 cells (15.6 g) were homogenized and lysed as described in U.S. Ser. No. 523,394, filed May 15, 1990, and in Example 3 below. Fraction I contained 1.87 g protein and 1.047×10⁶ units of activity. Fraction II, obtained as a 0.2M ammonium sulfate supernatant, contained 1.84 g protein and 1.28×10⁶ units of activity in 74 ml.

Following heat treatment, Polymin P (pH 7.5) was added slowly to 0.7%. Following centrifugation, the supernatant, Fraction III, contained 155 mg protein and 1.48×10⁶ units of activity.

Fraction III was loaded onto a 1.15×3.1 cm (3.2 ml) phenyl sepharose column at 10 ml/cm²/hour. All of the applied activity was retained on the column. The column was washed with 15 ml of the equilibration buffer and then 5 ml (1.5 column volumes) of 0.1M KCl in TE. The polymerase activity was eluted with 2M urea in TE containing 20% ethylene glycol. Fractions (0.5 ml each) with polymerase activity were pooled (8.5 ml) and dialyzed into heparin sepharose buffer containing 0.1M KCl. The dialyzed material, Fraction IV (12.5 ml), contained 5.63 mg of protein and 1.29×10⁶ units of activity.

Fraction IV was loaded onto a 1.0 ml bed volume heparin sepharose column equilibrated as above. The column was washed with 6 ml of the same buffer (A₂₈₀ to baseline) and eluted with a 15 ml linear 0.1-0.5M KCl gradient in the same buffer. Fractions (0.15 ml) eluting between 0.16 and 0.27M KCl were analyzed by SDS-PAGE. A minor (<1%) contaminating approximately 47 kDa protein copurified with 61 kDa Taq Pol I. Fractions eluting between 0.165 and 0.255M KCl were pooled (2.5 ml) and diafiltered on a Centricon 30 membrane into 2.5× storage buffer. Fraction V contained 2.8 mg of protein and 1.033×10⁶ units of 61 kDa Taq Pol.
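The protein and activity figures reported for Fractions I through V can be collected into a conventional purification summary. The following sketch derives specific activity, fold purification and yield relative to Fraction I from those reported values only; the layout and function name are illustrative and are not part of the original protocol.

```python
# (fraction, protein in mg, activity in units) as reported in the text
fractions = [
    ("I", 1870.0, 1.047e6),
    ("II", 1840.0, 1.28e6),
    ("III", 155.0, 1.48e6),
    ("IV", 5.63, 1.29e6),
    ("V", 2.8, 1.033e6),
]

def summarize(rows):
    """Return (name, units/mg, fold purification, % yield) for each row."""
    _, protein0, units0 = rows[0]
    base_specific = units0 / protein0
    summary = []
    for name, protein_mg, units in rows:
        specific = units / protein_mg
        summary.append((name, specific, specific / base_specific,
                        100.0 * units / units0))
    return summary

table = summarize(fractions)
for name, specific, fold, yield_pct in table:
    print(f"Fraction {name}: {specific:,.0f} U/mg, "
          f"{fold:,.1f}-fold, {yield_pct:.0f}% yield")
```

Because the reported activity rises in Fractions II through IV, the computed yields for those fractions exceed 100% of Fraction I.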

PCR Using Purified 61 kDa Taq Pol

PCR reactions (50 μl) containing 0.5 ng lambda DNA, 10 pmol each of two lambda-specific primers, 200 μM each dNTP, 10 mM Tris-Cl, pH 8.3, 3 mM MgCl₂, 10 mM KCl and 3.5 units of 61 kDa Taq Pol were performed. As a comparison, PCR reactions were performed with 1.25 units of full-length Taq Pol, as above, with the substitution of 2 mM MgCl₂ and 50 mM KCl. Thermocycling conditions were 1 minute at 95° C. and 1 minute at 60° C. for 23 cycles, with a final 5 minute extension at 75° C. The amount of DNA per reaction was quantitated by the Hoechst fluorescent dye assay. 1.11 μg of product was obtained with 61 kDa Taq Pol (2.2×10⁵-fold amplification), as compared with 0.70 μg of DNA with full-length Taq Pol (1.4×10⁵-fold amplification).
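One way to relate product mass to fold amplification is to divide the product by the mass of target sequence present in the input DNA. The sketch below does this under two assumptions that are not stated in the text: a lambda genome length of 48,502 bp and a hypothetical 500 bp target. It is offered only as an illustration of the arithmetic, not as the method actually used to obtain the reported figures.

```python
LAMBDA_GENOME_BP = 48502   # length of the phage lambda genome
ASSUMED_TARGET_BP = 500    # hypothetical amplicon length; not given in the text

def fold_amplification(product_ug, template_ng, target_bp=ASSUMED_TARGET_BP):
    """Product mass divided by the mass of target sequence in the input DNA."""
    target_ng = template_ng * target_bp / LAMBDA_GENOME_BP
    return product_ug * 1000.0 / target_ng

truncated = fold_amplification(1.11, 0.5)     # 61 kDa Taq Pol reaction
full_length = fold_amplification(0.70, 0.5)   # full-length Taq Pol reaction

print(f"61 kDa Taq Pol:      {truncated:.2e}-fold")
print(f"full-length Taq Pol: {full_length:.2e}-fold")
```

With the assumed 500 bp target, both values come out close to the fold amplifications quoted above; a different target length would scale both results by the same factor.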

Thermostability of 61 kDa Taq Pol

Steady state thermal inactivation of recombinant 94 kDa Taq Pol and 61 kDa Taq Pol was performed at 97.5° C. under buffer conditions mimicking PCR. 94 kDa Taq Pol has an apparent half-life of approximately 9 minutes at 97.5° C., whereas the half-life of 61 kDa Taq Pol was approximately 21 minutes. The thermal inactivation of 61 kDa Taq Pol was unaffected by KCl concentration over a range from 0 to 50 mM.
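The practical meaning of the two half-lives can be illustrated by estimating how much activity would remain after a given time at 97.5° C. The sketch assumes simple first-order (exponential) inactivation, which the text does not state; only the two half-lives are taken from the text, and the 10 minute exposure is an arbitrary example.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of activity left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)

HALF_LIVES_MIN = {"94 kDa Taq Pol": 9.0, "61 kDa Taq Pol": 21.0}
EXPOSURE_MIN = 10.0  # arbitrary example of cumulative time at 97.5 C

remaining = {name: fraction_remaining(EXPOSURE_MIN, t_half)
             for name, t_half in HALF_LIVES_MIN.items()}
for name, frac in remaining.items():
    print(f"{name}: {100 * frac:.0f}% of activity remaining")
```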

Yet another truncated Tag polymerase gene contained within the ˜2.68 kbHindIII-Asp718 fragment of plasmid pFC85 can be expressed using, forexample, plasmid pP_(L) N_(RBS) ATG, by operably linking theamino-terminal HindIII restriction site encoding the Taq pol gene to anATG initiation codon. The product of this fusion upon expression willyield an ˜70,000-72,000 dalton truncated polymerase.

This specific construction can be made by digesting plasmid pFC85 with HindIII and treating with Klenow fragment in the presence of dATP and dGTP. The resulting fragment is treated further with S1 nuclease to remove any single-stranded extensions and the resulting DNA digested with Asp718 and treated with Klenow fragment in the presence of all four dNTPs. The recovered fragment can be ligated using T4 DNA ligase to dephosphorylated plasmid pP_(L) N_(RBS) ATG, which had been digested with SacI and treated with Klenow fragment in the presence of dGTP to construct an ATG blunt end. This ligation mixture can then be used to transform E. coli DG116 and the transformants screened for production of Taq polymerase. Expression can be confirmed by Western immunoblot analysis and activity analysis.

EXAMPLE 3 Construction, Expression and Purification of a Truncated 5' to 3' Exonuclease Deficient Tma Polymerase (MET284)

To express a 5' to 3' exonuclease deficient Tma DNA polymerase lacking amino acids 1-283 of native Tma DNA polymerase, the following steps were performed.

Plasmid pTma12-1 was digested with BspHI (nucleotide position 848) and HindIII (nucleotide position 2629). A 1781 base pair fragment was isolated by agarose gel purification. To separate the agarose from the DNA, a gel slice containing the desired fragment was frozen at -20° C. in a Costar spinex filter unit. After thawing at room temperature, the unit was spun in a microfuge. The filtrate containing the DNA was concentrated in a Speed Vac concentrator, and the DNA was precipitated with ethanol.

The isolated fragment was cloned into plasmid pTma12-1 digested with NcoI and HindIII. Because NcoI digestion leaves the same cohesive end sequence as digestion with BspHI, the 1781 base pair fragment has the same cohesive ends as the full length fragment excised from plasmid pTma12-1 by digestion with NcoI and HindIII. The ligation of the isolated fragment with the digested plasmid results in a fragment switch and was used to create a plasmid designated pTma14.
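The fragment switch relies on NcoI and BspHI producing identical cohesive ends. The sketch below illustrates this using the standard recognition sequences of the enzymes (NcoI C^CATGG, BspHI T^CATGA, HindIII A^AGCTT), which are general knowledge and not recited in the text. It also checks the stated fragment length from the two nucleotide positions. All names are illustrative.

```python
# Recognition sequence (top strand, 5'->3') and the number of bases
# preceding the top-strand cut for each enzyme.
ENZYMES = {
    "NcoI": ("CCATGG", 1),     # C^CATGG
    "BspHI": ("TCATGA", 1),    # T^CATGA
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def five_prime_overhang(site: str, cut: int) -> str:
    """5' single-stranded overhang left by a palindromic site cut `cut` bases in."""
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True when both enzymes leave the same cohesive end."""
    return five_prime_overhang(*ENZYMES[enzyme_a]) == five_prime_overhang(*ENZYMES[enzyme_b])


# Fragment length from the positions given in the text.
BSPHI_POSITION = 848
HINDIII_POSITION = 2629
fragment_bp = HINDIII_POSITION - BSPHI_POSITION

print(five_prime_overhang(*ENZYMES["NcoI"]), five_prime_overhang(*ENZYMES["BspHI"]))
print("NcoI/BspHI compatible:", compatible("NcoI", "BspHI"))
print("fragment length:", fragment_bp, "bp")
```

Both enzymes leave a 5'-CATG overhang, so a BspHI end can be ligated into an NcoI-cut vector, although the hybrid junction is no longer cleavable by either enzyme.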

Plasmid pTma15 was similarly constructed by cloning the same isolated fragment into pTma13. As with pTma14, pTma15 drives expression of a polymerase that lacks amino acids 1 through 283 of native Tma DNA polymerase; translation initiates at the methionine codon at position 284 of the native coding sequence.

Both the pTma14 and pTma15 expression plasmids expressed, at a high level, a biologically active thermostable DNA polymerase of about 70 kDa that is devoid of 5' to 3' exonuclease activity; plasmid pTma15 expressed polymerase at a higher level than did pTma14. Based on similarities with E. coli Pol I Klenow fragment, such as conservation of amino acid sequence motifs in all three domains that are critical for 3' to 5' exonuclease activity, distance from the amino terminus to the first domain critical for exonuclease activity, and length of the expressed protein, the shortened form (MET284) of Tma DNA polymerase exhibits 3' to 5' exonuclease or proof-reading activity but lacks 5' to 3' exonuclease activity. Initial SDS activity gel assays and solution assays for 3' to 5' exonuclease activity suggest attenuation in the level of proof-reading activity of the polymerase expressed by E. coli host cells harboring plasmid pTma15.

MET284 Tma DNA polymerase was purified from E. coli strain DG116 containing plasmid pTma15. The seed flask for a 10 L fermentation contained tryptone (20 g/l), yeast extract (10 g/l), NaCl (10 g/l), glucose (10 g/l), ampicillin (50 mg/l), and thiamine (10 mg/l). The seed flask was inoculated with a colony from an agar plate (a frozen glycerol culture can be used). The seed flask was grown at 30° C. to between 0.5 and 2.0 O.D. (A₆₈₀). The volume of seed culture inoculated into the fermentor is calculated such that the bacterial concentration is 0.5 mg dry weight/liter. The 10 liter growth medium contained 25 mM KH₂PO₄, 10 mM (NH₄)₂SO₄, 4 mM sodium citrate, 0.4 mM FeCl₃, 0.04 mM ZnCl₂, 0.03 mM CoCl₂, 0.03 mM CuCl₂, and 0.03 mM H₃BO₃. The following sterile components were added: 4 mM MgSO₄, 20 g/l glucose, 20 mg/l thiamine, and 50 mg/l ampicillin. The pH was adjusted to 6.8 with NaOH and controlled during the fermentation by added NH₄OH. Glucose was continually added by coupling to NH₄OH addition. Foaming was controlled by the addition of propylene glycol as necessary, as an antifoaming agent. Dissolved oxygen concentration was maintained at 40%.

The fermentor was inoculated as described above, and the culture was grown at 30° C. to a cell density of 0.5 to 1.0×10¹⁰ cells/ml (optical density [A₆₈₀] of 15). The growth temperature was shifted to 38° C. to induce the synthesis of MET284 Tma DNA polymerase. The temperature shift increases the copy number of the pTma15 plasmid and simultaneously derepresses the lambda P_(L) promoter controlling transcription of the modified Tma DNA polymerase gene through inactivation of the temperature-sensitive cI repressor encoded by the defective prophage lysogen in the host.

The cells were grown for 6 hours to an optical density of 37 (A₆₈₀) and harvested by centrifugation. The cell mass (ca. 95 g/l) was resuspended in an equivalent volume of buffer containing 50 mM Tris-Cl, pH 7.6, 20 mM EDTA and 20% (w/v) glycerol. The suspension was slowly dripped into liquid nitrogen to freeze the suspension as "beads" or small pellets. The frozen cells were stored at -70° C.

To 200 g of frozen beads (containing 100 g wet weight of cells) were added 100 ml of 1× TE (50 mM Tris-Cl, pH 7.5, 10 mM EDTA) and DTT to 0.3 mM, PMSF to 2.4 mM, leupeptin to 1 μg/ml and TLCK (a protease inhibitor) to 0.2 mM. The sample was thawed on ice and uniformly resuspended in a blender at low speed. The cell suspension was lysed in an Aminco French pressure cell at 20,000 psi. To reduce viscosity, the lysed cell sample was sonicated 4 times for 3 min. each at 50% duty cycle and 70% output. The sonicate was adjusted to 550 ml with 1× TE containing 1 mM DTT, 2.4 mM PMSF, 1 μg/ml leupeptin and 0.2 mM TLCK (Fraction I). After addition of ammonium sulfate to 0.3M, the crude lysate was rapidly brought to 75° C. in a boiling water bath and transferred to a 75° C. water bath for 15 min. to denature and inactivate E. coli host proteins. The heat-treated sample was chilled rapidly to 0° C. and incubated on ice for 20 min. Precipitated proteins and cell membranes were removed by centrifugation at 20,000×G for 30 min. at 5° C. and the supernatant (Fraction II) saved.

The heat-treated supernatant (Fraction II) was treated with polyethyleneimine (PEI) to remove most of the DNA and RNA. Polymin P (34.96 ml of 10% [w/v], pH 7.5) was slowly added to 437 ml of Fraction II at 0° C. while stirring rapidly. After 30 min. at 0° C., the sample was centrifuged at 20,000×G for 30 min. The supernatant (Fraction III) was applied at 80 ml/hr to a 100 ml phenyl sepharose column (3.2×12.5 cm) that had been equilibrated in 50 mM Tris-Cl, pH 7.5, 0.3M ammonium sulfate, 10 mM EDTA, and 1 mM DTT. The column was washed with about 200 ml of the same buffer (A₂₈₀ to baseline) and then with 150 ml of 50 mM Tris-Cl, pH 7.5, 100 mM KCl, 10 mM EDTA and 1 mM DTT. The MET284 Tma DNA polymerase was then eluted from the column with buffer containing 50 mM Tris-Cl, pH 7.5, 2M urea, 20% (w/v) ethylene glycol, 10 mM EDTA, and 1 mM DTT, and fractions containing DNA polymerase activity were pooled (Fraction IV).

Fraction IV is adjusted to a conductivity equivalent to 50 mM KCl in 50 mM Tris-Cl, pH 7.5, 1 mM EDTA, and 1 mM DTT. The sample was applied (at 9 ml/hr) to a 15 ml heparin-sepharose column that had been equilibrated in the same buffer. The column was washed with the same buffer at ca. 14 ml/hr (3.5 column volumes) and eluted with a 150 ml 0.05 to 0.5M KCl gradient in the same buffer. The DNA polymerase activity eluted between 0.11 and 0.22M KCl. Fractions containing the pTma15 encoded modified Tma DNA polymerase are pooled, concentrated, and diafiltered against 2.5× storage buffer (50 mM Tris-Cl, pH 8.0, 250 mM KCl, 0.25 mM EDTA, 2.5 mM DTT, and 0.5% Tween 20), subsequently mixed with 1.5 volumes of sterile 80% (w/v) glycerol, and stored at -20° C. Optionally, the heparin sepharose-eluted DNA polymerase or the phenyl sepharose-eluted DNA polymerase can be dialyzed or adjusted to a conductivity equivalent to 50 mM KCl in 50 mM Tris-Cl, pH 7.5, 1 mM DTT, 1 mM EDTA, and 0.2% Tween 20 and applied (1 mg protein/ml resin) to an affigel blue column that has been equilibrated in the same buffer. The column is washed with three to five column volumes of the same buffer and eluted with a 10 column volume KCl gradient (0.05 to 0.8M) in the same buffer. Fractions containing DNA polymerase activity (eluting between 0.25 and 0.4M KCl) are pooled, concentrated, diafiltered, and stored as above.
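Mixing one volume of 2.5× storage buffer with 1.5 volumes of 80% glycerol dilutes every buffer component 2.5-fold. The following sketch works through that dilution, assuming simple additive volumes (an idealization, since glycerol-water mixtures are not perfectly additive); the dictionary keys and function name are illustrative.

```python
# Composition of the 2.5x storage buffer as given in the text.
STORAGE_2_5X = {
    "Tris-Cl pH 8.0 (mM)": 50.0,
    "KCl (mM)": 250.0,
    "EDTA (mM)": 0.25,
    "DTT (mM)": 2.5,
    "Tween 20 (%)": 0.5,
}


def final_composition(stock: dict, stock_volumes: float = 1.0,
                      glycerol_volumes: float = 1.5,
                      glycerol_pct: float = 80.0) -> dict:
    """Concentrations after mixing stock buffer with glycerol (additive volumes assumed)."""
    total = stock_volumes + glycerol_volumes
    final = {name: conc * stock_volumes / total for name, conc in stock.items()}
    final["glycerol (% w/v)"] = glycerol_pct * glycerol_volumes / total
    return final


final_buffer = final_composition(STORAGE_2_5X)
for component, value in final_buffer.items():
    print(f"{component}: {value:g}")
```

The result corresponds to a 1× buffer of 20 mM Tris-Cl, 100 mM KCl, 0.1 mM EDTA, 1 mM DTT and 0.2% Tween 20 in 48% glycerol.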

The relative thermoresistance of various DNA polymerases has been compared. At 97.5° C. the half-life of native Tma DNA polymerase is more than twice the half-life of either native or recombinant Taq (i.e., AmpliTaq) DNA polymerase. Surprisingly, the half-life at 97.5° C. of MET284 Tma DNA polymerase is 2.5 to 3 times longer than the half-life of native Tma DNA polymerase.

PCR tubes containing 10 mM Tris-Cl, pH 8.3, and 1.5 mM MgCl₂ (for Taq or native Tma DNA polymerase) or 3 mM MgCl₂ (for MET284 Tma DNA polymerase), 50 mM KCl (for Taq, native Tma and MET284 Tma DNA polymerases) or no KCl (for MET284 Tma DNA polymerase), 0.5 μM each of primers PCR01 and PCR02, 1 ng of lambda template DNA, 200 μM of each dNTP except dCTP, and 4 units of each enzyme were incubated at 97.5° C. in a large water bath for times ranging from 0 to 60 min. Samples were withdrawn with time, stored at 0° C., and 5 μl assayed at 75° C. for 10 min. in a standard activity assay for residual activity.

Taq DNA polymerase had a half-life of about 10 min. at 97.5° C., while native Tma DNA polymerase had a half-life of about 21 to 22 min. at 97.5° C. Surprisingly, the MET284 form of Tma DNA polymerase had a significantly longer half-life (50 to 55 min.) than either Taq or native Tma DNA polymerase. The improved thermoresistance of MET284 Tma DNA polymerase will find applications in PCR, particularly where G+C-rich targets are difficult to amplify because the strand-separation temperature required for complete denaturation of target and PCR product sequences leads to enzyme inactivation.
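To give a sense of what these half-lives mean in practice, the sketch below estimates the fraction of activity remaining after a given cumulative time at 97.5° C. It assumes simple first-order (single-exponential) inactivation, which the text does not state, and uses the midpoints of the reported half-life ranges; both choices are assumptions made for illustration.

```python
# Half-lives at 97.5 C in minutes. ASSUMPTION: midpoints of the ranges reported
# in the text (about 10; 21 to 22; 50 to 55).
HALF_LIFE_MIN = {
    "Taq": 10.0,
    "native Tma": 21.5,
    "MET284 Tma": 52.5,
}


def fraction_remaining(minutes_at_temp: float, half_life_min: float) -> float:
    """Fraction of activity left, ASSUMING first-order (exponential) inactivation."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes_at_temp / half_life_min)


# Example: 25 cycles with a hypothetical 1 minute denaturation step at 97.5 C.
cumulative_minutes = 25 * 1.0
remaining = {name: fraction_remaining(cumulative_minutes, t_half)
             for name, t_half in HALF_LIFE_MIN.items()}
for name, fraction in remaining.items():
    print(f"{name}: {fraction:.0%} of initial activity")
```

Under these assumptions the enzyme with the longer half-life retains a markedly larger share of its activity over the same cumulative exposure, which is the practical benefit the passage describes for targets requiring high strand-separation temperatures.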

PCR tubes containing 50 μl of 10 mM Tris-Cl, pH 8.3, 3 mM MgCl₂, 200 μM of each dNTP, 0.5 ng bacteriophage lambda DNA, 0.5 μM of primer PCR01, 4 units of MET284 Tma DNA polymerase, and 0.5 μM of primer PCR02 or PL10 were cycled for 25 cycles using T_(den) of 96° C. for 1 min. and T_(anneal-extend) of 60° C. for 2 min. Lambda DNA template, deoxynucleotide stock solutions, and primers PCR01 and PCR02 were part of the PECI GeneAmp kit. Primer PL10 has the sequence 5'-GGCGTACCTTTGTCTCACGGGCAAC-3' (SEQ ID NO:25) and is complementary to bacteriophage lambda nucleotides 8106-8130.

The primers PCR01 and PCR02 amplify a 500 bp product from lambda. The primer pair PCR01 and PL10 amplify a 1 kb product from lambda. After amplification with the respective primer sets, 5 μl aliquots were subjected to agarose gel electrophoresis and the specific intended product bands visualized with ethidium bromide staining. Abundant levels of product were generated with both primer sets, showing that MET284 Tma DNA polymerase successfully amplified the intended target sequence.

EXAMPLE 4 Expression of Truncated Tma DNA Polymerase

To express a 5' to 3' exonuclease deficient form of Tma DNA polymerase which initiates translation at MET 140, the coding region corresponding to amino acids 1 through 139 was deleted from the expression vector. The protocol for constructing such a deletion is similar to the construction described in Examples 2 and 3: a shortened gene fragment is excised and then reinserted into a vector from which a full length fragment has been excised. However, the shortened fragment can be obtained as a PCR amplification product rather than purified from a restriction digest. This methodology allows a new upstream restriction site (or other sequences) to be incorporated where useful.

To delete the region up to the methionine codon at position 140, an SphI site was introduced into pTma12-1 and pTma13 using PCR. A forward primer corresponding to nucleotides 409-436 of Tma DNA polymerase SEQ ID NO:3 (FL63) was designed to introduce an SphI site just upstream of the methionine codon at position 140. The reverse primer corresponding to the complement of nucleotides 608-634 of Tma DNA polymerase SEQ ID NO:3 (FL69) was chosen to include an XbaI site at position 621. Plasmid pTma12-1 linearized with SmaI was used as the PCR template, yielding a PCR product of approximately 225 bp.

Before digestion, the PCR product was treated with 50 μg/ml of Proteinase K in PCR reaction mix plus 0.5% SDS and 5 mM EDTA. After incubating for 30 minutes at 37° C., the Proteinase K was heat inactivated at 68° C. for 10 minutes. This procedure eliminated any Taq polymerase bound to the product that could inhibit subsequent restriction digests. The buffer was changed to a TE buffer, and the excess PCR primers were removed with a Centricon 100 microconcentrator.

The amplified fragment was digested with SphI, then treated with Klenow to create a blunt end at the SphI-cleaved end, and finally digested with XbaI. The resulting fragment was ligated with plasmid pTma13 (pTma12-1 would have been suitable) that had been digested with NcoI, repaired with Klenow, and then digested with XbaI. The ligation yielded an in-frame coding sequence with the region following the NcoI site (at the first methionine codon of the coding sequence) and the introduced SphI site (upstream of the methionine codon at position 140) deleted. The resulting expression vector was designated pTma16.

The primers used in this example are given below and in the Sequence Listing section.

    ________________________________________________________
    Primer  SEQ ID NO:     Sequence
    ________________________________________________________
    FL63    SEQ ID NO: 26  5'GATAAAGGCATGCTTCAGCTTGTGAACG
    FL69    SEQ ID NO: 27  5'TGTACTTCTCTAGAAGCTGAACAGCAG
    ________________________________________________________
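The design of primers FL63 and FL69 can be checked against the coordinates given in this example. The sketch below locates the SphI (GCATGC) and XbaI (TCTAGA) recognition sequences in the primers, maps them onto the numbering of SEQ ID NO:3, and computes the expected amplicon length. The recognition sequences are standard enzyme data rather than part of the text, and the helper names are illustrative.

```python
# Primer sequences (5'->3') with the SEQ ID NO:3 coordinates they span.
# FL69 is the complement of its target, hence strand "-".
PRIMERS = {
    "FL63": ("GATAAAGGCATGCTTCAGCTTGTGAACG", 409, 436, "+"),
    "FL69": ("TGTACTTCTCTAGAAGCTGAACAGCAG", 608, 634, "-"),
}
SITES = {"SphI": "GCATGC", "XbaI": "TCTAGA"}  # both are palindromic


def site_start(primer: str, enzyme: str):
    """Top-strand coordinate where the recognition site begins, or None if absent."""
    seq, first, last, strand = PRIMERS[primer]
    site = SITES[enzyme]
    idx = seq.find(site)
    if idx < 0:
        return None
    if strand == "+":
        return first + idx
    # On the reverse primer, index 0 pairs with the `last` coordinate; because
    # the site is palindromic it reads identically on the top strand.
    return last - (idx + len(site) - 1)


sphi_start = site_start("FL63", "SphI")
xbai_start = site_start("FL69", "XbaI")
amplicon_bp = PRIMERS["FL69"][2] - PRIMERS["FL63"][1] + 1

print("SphI site begins at nucleotide", sphi_start)
print("XbaI site begins at nucleotide", xbai_start)
print("expected amplicon length:", amplicon_bp, "bp")
```

The XbaI site maps to nucleotide 621 and the amplicon to 226 bp, consistent with the position and the approximate product size stated in the text.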

EXAMPLE 5 Elimination of Undesired RBS in MET140 Expression Vectors

Reduced expression of the MET140 form of Tma DNA polymerase can be achieved by eliminating the ribosome binding site (RBS) upstream of the methionine codon at position 140. The RBS was eliminated via oligonucleotide site-directed mutagenesis without changing the amino acid sequence. Taking advantage of the redundancy of the genetic code, one can make changes in the third position of codons to alter the nucleic acid sequence, thereby eliminating the RBS, without changing the amino acid sequence of the encoded protein.

A mutagenic primer (FL64) containing the modified sequence was synthesized and phosphorylated. Single-stranded pTma09 (a full length clone having an NcoI site) was prepared by coinfecting with the helper phage R408, commercially available from Stratagene. A "gapped duplex" of single stranded pTma09 and the large fragment from the PvuII digestion of pBS13+ was created by mixing the two plasmids, heating to boiling for 2 minutes, and cooling to 65° C. for 5 minutes. The phosphorylated primer was then annealed with the "gapped duplex" by mixing, heating to 80° C. for 2 minutes, and then cooling slowly to room temperature. The remaining gaps were filled by extension with Klenow and the fragments ligated with T4 DNA ligase, both reactions taking place in 200 μM of each dNTP and 40 μM ATP in standard salts at 37° C. for 30 minutes.

The resulting circular fragment was transformed into DG101 host cells by plate transformations on nitrocellulose filters. Duplicate filters were made and the presence of the correct plasmid was detected by probing with a ³²P-phosphorylated probe (FL65). The vector that resulted was designated pTma19.

The RBS minus portion from pTma19 was cloned into pTma12-1 via an NcoI/XbaI fragment switch. Plasmid pTma19 was digested with NcoI and XbaI, and the 620 bp fragment was purified by gel electrophoresis, as in Example 3, above. Plasmid pTma12-1 was digested with NcoI, XbaI, and XcmI. The XcmI cleavage inactivates the RBS+ fragment for the subsequent ligation step, which is done under conditions suitable for ligating "sticky" ends (dilute ligase and 40 μM ATP). Finally, the ligation product is transformed into DG116 host cells for expression and designated pTma19-RBS.

The oligonucleotide sequences used in this example are listed below and in the Sequence Listing section.

    _______________________________________________________________
    Oligo  SEQ ID NO:     Sequence
    _______________________________________________________________
    FL64   SEQ ID NO: 28  5'CTGAAGCATGTCTTTGTCACCGGTTACTATGAATAT
    FL65   SEQ ID NO: 29  5'TAGTAACCGGTGACAAAG
    _______________________________________________________________
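Probe FL65 is used to detect plasmids carrying the sequence introduced by mutagenic primer FL64. A quick way to see why the probe is specific is to check that its reverse complement lies within FL64. The following sketch does this; it is an illustration only, and the function name is not from the original text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


FL64 = "CTGAAGCATGTCTTTGTCACCGGTTACTATGAATAT"  # mutagenic primer (SEQ ID NO: 28)
FL65 = "TAGTAACCGGTGACAAAG"                    # detection probe (SEQ ID NO: 29)

probe_target = reverse_complement(FL65)
offset = FL64.find(probe_target)  # -1 would mean the probe does not match FL64

print("probe target:", probe_target)
print("offset within FL64:", offset)
```

The reverse complement of FL65 is found as an exact 18-nucleotide match inside FL64, so the probe anneals to the strand complementary to the mutagenic primer across the modified region.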

EXAMPLE 6 Expression of Truncated Tma DNA Polymerases MET-ASP21 and MET-GLU74

To effect translation initiation at the aspartic acid codon at position 21 of the Tma DNA polymerase gene coding sequence, a methionine codon is introduced before the codon, and the region from the initial NcoI site to this introduced methionine codon is deleted. Similar to Example 4, the deletion process involved PCR with the same downstream primer described above (FL69) and an upstream primer (FL66) designed to incorporate an NcoI site and a methionine codon to yield a 570 base pair product.

The amplified product was concentrated with a Centricon-100 microconcentrator to eliminate excess primers and buffer. The product was concentrated in a Speed Vac concentrator and then resuspended in the digestion mix. The amplified product was digested with NcoI and XbaI. Likewise, pTma12-1, pTma13, or pTma19-RBS was digested with the same two restriction enzymes, and the digested, amplified fragment is ligated with the digested expression vector. The resulting construct has a deletion from the NcoI site upstream of the start codon of the native Tma coding sequence to the new methionine codon introduced upstream of the aspartic acid codon at position 21 of the native Tma coding sequence.

Similarly, a deletion mutant was created such that translation initiation begins at Glu74, the glutamic acid codon at position 74 of the native Tma coding sequence. An upstream primer (FL67) is designed to introduce a methionine codon and an NcoI site before Glu74. The downstream primer and cloning protocol used are as described above for the MET-ASP21 construct.

The upstream primer sequences used in this example are listed below and in the Sequence Listing section.

    __________________________________________________________
    Oligo  SEQ ID NO:     Sequence
    __________________________________________________________
    FL66   SEQ ID NO: 30  5'CTATGCCATGGATAGATCGCTTTCTACTTCC
    FL67   SEQ ID NO: 31  5'CAAGCCCATGGAAACTTACAAGGCTCAAAGA
    __________________________________________________________
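Each upstream primer in this example places an NcoI site (CCATGG) so that its internal ATG sits immediately before the intended first native codon. The sketch below locates the NcoI site in FL66 and FL67 and reads the codon that follows the ATG. The small codon table covers only the two codons needed here, and all helper names are illustrative.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence; the ATG is at offset 2

# Only the two codons relevant to this example are listed.
CODON_TO_RESIDUE = {"GAT": "Asp", "GAA": "Glu"}

UPSTREAM_PRIMERS = {
    "FL66": "CTATGCCATGGATAGATCGCTTTCTACTTCC",  # SEQ ID NO: 30
    "FL67": "CAAGCCCATGGAAACTTACAAGGCTCAAAGA",  # SEQ ID NO: 31
}


def codon_after_start(primer_seq: str) -> str:
    """Return the codon immediately following the ATG embedded in the NcoI site."""
    site_index = primer_seq.find(NCOI_SITE)
    if site_index < 0:
        raise ValueError("primer does not contain an NcoI site")
    atg_index = site_index + 2
    return primer_seq[atg_index + 3:atg_index + 6]


first_residues = {}
for name, seq in UPSTREAM_PRIMERS.items():
    codon = codon_after_start(seq)
    first_residues[name] = CODON_TO_RESIDUE[codon]
    print(f"{name}: ATG followed by {codon} ({first_residues[name]})")
```

FL66 encodes Met followed by Asp and FL67 encodes Met followed by Glu, matching the MET-ASP21 and MET-GLU74 designations.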

EXAMPLE 7 Expression of Truncated Taf Polymerase

Mutein forms of the Taf polymerase lacking 5' to 3' exonuclease activity were constructed by introducing deletions in the 5' end of the Taf polymerase gene. Both 279 and 417 base pair deletions were created using the following protocol: an expression plasmid was digested with restriction enzymes to excise the desired fragment, the fragment ends were repaired with Klenow and all four dNTPs to produce blunt ends, and the products were ligated to produce a new circular plasmid with the desired deletion. To express a 93 kilodalton, 5' to 3' exonuclease-deficient form of Taf polymerase, a 279 bp deletion comprising amino acids 2-93 was generated. To express an 88 kilodalton, 5' to 3' exonuclease-deficient form of Taf polymerase, a 417 bp deletion comprising amino acids 2-139 was generated.

To create a plasmid with codons 2-93 deleted, pTaf03 was digested with NcoI and NdeI and the ends were repaired by Klenow treatment. The digested and repaired plasmid was diluted to 5 μg/ml and ligated under blunt end conditions. The dilute plasmid concentration favors intramolecular ligations. The ligated plasmid was transformed into DG116. Mini-screen DNA preparations were subjected to restriction analysis and correct plasmids were confirmed by DNA sequence analysis. The resulting expression vector created by deleting a segment from pTaf03 was designated pTaf09. A similar vector created from pTaf05 was designated pTaf10.

Expression vectors also were created with codons 2-139 deleted. The same protocol was used with the exception that the initial restriction digestion was performed with NcoI and BglII. The expression vector created from pTaf03 was designated pTaf11 and the expression vector created from pTaf05 was designated pTaf12.

EXAMPLE 8 Derivation and Expression of 5' to 3' Exonuclease-Deficient, Thermostable DNA Polymerase of Thermus species Z05 Comprising Amino Acids 292 through 834

To obtain a DNA fragment encoding a 5' to 3' exonuclease-deficient thermostable DNA polymerase from Thermus species Z05, a portion of the DNA polymerase gene comprising amino acids 292 through 834 is selectively amplified in a PCR with forward primer TZA292 and reverse primer TZR01 as follows:

50 pmoles TZA292

50 pmoles TZR01

10 ng Thermus sp. Z05 genomic DNA

2.5 units AmpliTaq DNA polymerase

50 μM each dATP, dGTP, dCTP, dTTP

in an 80 μl solution containing 10 mM Tris-HCl pH 8.3, 50 mM KCl and overlaid with 100 μl of mineral oil. The reaction was initiated by addition of 20 μl containing 7.5 mM MgCl₂ after the tubes had been placed in an 80° C. preheated cycler.

The genomic DNA was digested to completion with restriction endonuclease Asp718, denatured at 98° C. for 5 minutes and cooled rapidly to 0° C. The sample was cycled in a Perkin-Elmer Cetus Thermal Cycler according to the following profile:

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 55° C. and hold for 30 seconds.

RAMP to 72° C. over 30 seconds and hold for 1 minute.

REPEAT profile for 3 cycles.

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 65° C. and hold for 2 minutes.

REPEAT profile for 25 cycles.

After last cycle HOLD for 5 minutes.
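The programmed portion of the cycling profile above can be totalled to estimate a minimum run time. The sketch below counts only the hold and ramp times explicitly listed in the profile; transitions between STEP CYCLE temperatures are instrument-dependent and are not included, so the result is a lower bound. The data layout and names are assumptions made for illustration.

```python
# Each block: (number of cycles, list of programmed segment durations in seconds).
PROFILE = [
    # 96 C hold, 55 C hold, ramp to 72 C, 72 C hold
    (3, [20, 30, 30, 60]),
    # 96 C hold, 65 C hold
    (25, [20, 120]),
]
FINAL_HOLD_SECONDS = 5 * 60  # "After last cycle HOLD for 5 minutes"


def programmed_seconds(profile, final_hold: int) -> int:
    """Sum of programmed segment times; excludes unlisted temperature transitions."""
    cycling = sum(cycles * sum(segments) for cycles, segments in profile)
    return cycling + final_hold


total_seconds = programmed_seconds(PROFILE, FINAL_HOLD_SECONDS)
print(f"programmed time: {total_seconds} s ({total_seconds / 60:.1f} min)")
```

The listed segments add up to 4220 seconds, or a little over 70 minutes before instrument transition times are added.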

The intended 1.65 kb PCR product is purified by agarose gel electrophoresis, and recovered following phenol-chloroform extraction and ethanol precipitation. The purified product is digested with restriction endonucleases NdeI and BglII and ligated with NdeI/BamHI-digested and dephosphorylated plasmid vector pDG164 (U.S. Ser. No. 455,967, filed Dec. 22, 1989, Example 6B, which was filed in the PCT as PCT/US90/07639 and published on Jul. 11, 1991, and which is incorporated herein by reference). Ampicillin-resistant transformants of E. coli strain DG116 are selected at 30° C. and screened for the desired recombinant plasmid. Plasmid pZ05A292 encodes a 544 amino acid, 5' to 3' exonuclease-deficient Thermus sp. Z05 thermostable DNA polymerase analogous to the pLSG8 encoded protein of Example 2. The DNA polymerase activity is purified as in Example 2. The purified protein is deficient in 5' to 3' exonuclease activity, is more thermoresistant than the corresponding native enzyme and is particularly useful in PCR of G+C-rich templates.

    ____________________________________________________________________________
    Primer  SEQ ID NO:     Sequence
    ____________________________________________________________________________
    TZA292  SEQ ID NO: 32  GTCGGCATATGGCTCCTGCTCCTCTTGAGGAGGCCCCCTGGCCCCCGCC
    TZR01   SEQ ID NO: 33  GACGCAGATCTCAGCCCTTGGCGGAAAGCCAGTCCTC
    ____________________________________________________________________________

EXAMPLE 9 Derivation and Expression of 5' to 3' Exonuclease-Deficient, Thermostable DNA Polymerase of Thermus species sps17 Comprising Amino Acids 288 through 830

To obtain a DNA fragment encoding a 5' to 3' exonuclease-deficient thermostable DNA polymerase from Thermus species sps17, a portion of the DNA polymerase gene comprising amino acids 288 through 830 is selectively amplified in a PCR with forward primer TSA288 and reverse primer TSR01 as follows:

50 pmoles TSA288

50 pmoles TSR01

10 ng Thermus sp. sps17 genomic DNA

2.5 units AmpliTaq DNA polymerase

50 μM each dATP, dGTP, dCTP, dTTP

in an 80 μl solution containing 10 mM Tris-HCl pH 8.3, 50 mM KCl and overlaid with 100 μl of mineral oil. The reaction was initiated by addition of 20 μl containing 7.5 mM MgCl₂ after the tubes had been placed in an 80° C. preheated cycler.

The genomic DNA was denatured at 98° C. for 5 minutes and cooled rapidly to 0° C. The sample was cycled in a Perkin-Elmer Cetus Thermal Cycler according to the following profile:

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 55° C. and hold for 30 seconds.

RAMP to 72° C. over 30 seconds and hold for 1 minute.

REPEAT profile for 3 cycles.

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 65° C. and hold for 2 minutes.

REPEAT profile for 25 cycles.

After last cycle HOLD for 5 minutes.

The intended 1.65 kb PCR product is purified by agarose gel electrophoresis, and recovered following phenol-chloroform extraction and ethanol precipitation. The purified product is digested with restriction endonucleases NdeI and BglII and ligated with NdeI/BamHI-digested and dephosphorylated plasmid vector pDG164 (U.S. Ser. No. 455,967, filed Dec. 22, 1989, Example 6B, which was filed in the PCT as PCT/US90/07639 and published on Jul. 11, 1991). Ampicillin-resistant transformants of E. coli strain DG116 are selected at 30° C. and screened for the desired recombinant plasmid. Plasmid pSPSA288 encodes a 544 amino acid, 5' to 3' exonuclease-deficient Thermus sp. sps17 thermostable DNA polymerase analogous to the pLSG8 encoded protein of Example 2. The DNA polymerase activity is purified as in Example 2. The purified protein is deficient in 5' to 3' exonuclease activity, is more thermoresistant than the corresponding native enzyme and is particularly useful in PCR of G+C-rich templates.

    ____________________________________________________________________________
    Primer  SEQ ID NO:     Sequence
    ____________________________________________________________________________
    TSA288  SEQ ID NO: 34  GTCGGCATATGGCTCCTAAAGAAGCTGAGGAGGCCCCCTGGCCCCCGCC
    TSR01   SEQ ID NO: 35  GACGCAGATCTCAGGCCTTGGCGGAAAGCCAGTCCTC
    ____________________________________________________________________________

EXAMPLE 10 Derivation and Expression of 5' to 3' Exonuclease-Deficient, Thermostable DNA Polymerase of Thermus thermophilus Comprising Amino Acids 292 through 834

To obtain a DNA fragment encoding a 5' to 3' exonuclease-deficient thermostable DNA polymerase from Thermus thermophilus, a portion of the DNA polymerase gene comprising amino acids 292 through 834 is selectively amplified in a PCR with forward primer TZA292 and reverse primer DG122 as follows:

50 pmoles TZA292

50 pmoles DG122

1 ng EcoRI digested plasmid pLSG22

2.5 units AmpliTaq DNA polymerase

50 μM each dATP, dGTP, dCTP, dTTP

in an 80 μl solution containing 10 mM Tris-HCl pH 8.3, 50 mM KCl and overlaid with 100 μl of mineral oil. The reaction was initiated by addition of 20 μl containing 7.5 mM MgCl₂ after the tubes had been placed in an 80° C. preheated cycler.

Plasmid pLSG22 (U.S. Ser. No. 455,967, filed Dec. 22, 1989, Example 4A, which was filed in the PCT as PCT/US90/07639 and published on Jul. 11, 1991, and which is incorporated herein by reference) was digested to completion with restriction endonuclease EcoRI, denatured at 98° C. for 5 minutes and cooled rapidly to 0° C. The sample was cycled in a Perkin-Elmer Cetus Thermal Cycler according to the following profile:

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 55° C. and hold for 30 seconds.

RAMP to 72° C. over 30 seconds and hold for 1 minute.

REPEAT profile for 3 cycles.

STEP CYCLE to 96° C. and hold for 20 seconds.

STEP CYCLE to 65° C. and hold for 2 minutes.

REPEAT profile for 20 cycles.

After last cycle HOLD for 5 minutes.

The intended 1.66 kb PCR product is purified by agarose gel electrophoresis, and recovered following phenol-chloroform extraction and ethanol precipitation. The purified product is digested with restriction endonucleases NdeI and BglII and ligated with NdeI/BamHI-digested and dephosphorylated plasmid vector pDG164 (U.S. Ser. No. 455,967, filed Dec. 22, 1989, Example 6B). Ampicillin-resistant transformants of E. coli strain DG116 are selected at 30° C. and screened for the desired recombinant plasmid. Plasmid pTTHA292 encodes a 544 amino acid, 5' to 3' exonuclease-deficient Thermus thermophilus thermostable DNA polymerase analogous to the pLSG8 encoded protein of Example 2. The DNA polymerase activity is purified as in Example 2. The purified protein is deficient in 5' to 3' exonuclease activity, is more thermoresistant than the corresponding native enzyme and is particularly useful in PCR of G+C-rich templates.

    ____________________________________________________________________________
    Primer  SEQ ID NO:     Sequence
    ____________________________________________________________________________
    TZA292  SEQ ID NO: 32  GTCGGCATATGGCTCCTGCTCCTCTTGAGGAGGCCCCCTGGCCCCCGCC
    DG122   SEQ ID NO: 36  CCTCTAAACGGCAGATCTGATATCAACCCTTGGCGGAAAGC
    ____________________________________________________________________________

EXAMPLE 11 Derivation and Expression of 5' to 3' Exonuclease-Deficient, Thermostable DNA Polymerase of Thermosipho africanus Comprising Amino Acids 285 through 892

To obtain a DNA fragment encoding a 5' to 3' exonuclease-deficient thermostable DNA polymerase from Thermosipho africanus, a portion of the DNA polymerase gene comprising amino acids 285 through 892 is selectively amplified in a PCR with forward primer TAFI285 and reverse primer TAFR01 as follows:

50 pmoles TAFI285

50 pmoles TAFR01

1 ng plasmid pBSM:TafRV3' DNA

2.5 units AmpliTaq DNA polymerase

50 μM each dATP, dGTP, dCTP, dTTP

in an 80 μl solution containing 10 mM Tris-HCl pH 8.3, 50 mM KCl and overlaid with 100 μl of mineral oil. The reaction was initiated by addition of 20 μl containing 7.5 mM MgCl₂ after the tubes had been placed in an 80° C. preheated cycler.
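The hot-start addition dilutes both the magnesium stock and the 80 μl mix. The following is a minimal sketch of that arithmetic, assuming the two volumes are simply additive (100 μl total) and that the stated Tris-HCl and KCl concentrations refer to the 80 μl mix before the MgCl₂ addition; the text does not state which volume those concentrations refer to. The function name `final_conc` is illustrative.

```python
def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Dilution by C1*V1 = C2*V2: concentration after stock_vol is brought to total_vol."""
    return stock_conc * stock_vol_ul / total_vol_ul


mix_ul = 80.0   # reaction mix before initiation
mg_ul = 20.0    # MgCl2 addition that starts the reaction
total_ul = mix_ul + mg_ul  # assumed additive: 100 ul

mgcl2_mM = final_conc(7.5, mg_ul, total_ul)
# Assumption: 10 mM Tris-HCl and 50 mM KCl describe the 80 ul mix.
tris_mM = final_conc(10.0, mix_ul, total_ul)
kcl_mM = final_conc(50.0, mix_ul, total_ul)
# 1 pmol per ul equals 1 uM, so 50 pmol in the total volume gives:
primer_uM = 50.0 / total_ul

print(f"MgCl2 {mgcl2_mM} mM, Tris-HCl {tris_mM} mM, KCl {kcl_mM} mM, "
      f"each primer {primer_uM} uM")
```

On these assumptions the reaction runs at 1.5 mM MgCl₂ with each primer at 0.5 μM.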

Plasmid pBSM:TafRV3' (obtained as described in PCT Patent Application No. PCT/US91/07076, which published on Apr. 16, 1992, Example 4, page 53, incorporated herein by reference) was digested with EcoRI to completion and the DNA was denatured at 98° C. for 5 minutes and cooled rapidly to 0° C. The sample was cycled in a Perkin-Elmer Cetus Thermal Cycler according to the following profile:

STEP CYCLE to 95° C. and hold for 30 seconds.

STEP CYCLE to 55° C. and hold for 30 seconds.

RAMP to 72° C. over 30 seconds and hold for 1 minute.

REPEAT profile for 3 cycles.

STEP CYCLE to 95° C. and hold for 30 seconds.

STEP CYCLE to 65° C. and hold for 2 minutes.

REPEAT profile for 20 cycles.

After last cycle HOLD for 5 minutes.
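For illustration, the two-segment profile above can be written down as data and its programmed duration totalled. This is only a sketch: the class and field names are hypothetical, every 95° C. hold is entered as 30 seconds (an assumption of this sketch), the temperature of the final 5 minute hold is not stated in the text and is therefore not modelled, and instrument transition times for STEP CYCLE moves are ignored.

```python
from dataclasses import dataclass


@dataclass
class Step:
    temp_c: float
    hold_s: int
    ramp_s: int = 0  # 0 models a STEP CYCLE move; >0 models a timed RAMP


@dataclass
class Segment:
    steps: list
    cycles: int

    def programmed_seconds(self):
        """Sum of programmed ramp and hold times over all cycles."""
        return self.cycles * sum(s.ramp_s + s.hold_s for s in self.steps)


# First segment: 3 cycles of 95/55/72 with a 30 s ramp to 72.
segment_1 = Segment(
    steps=[Step(95, 30), Step(55, 30), Step(72, 60, ramp_s=30)],
    cycles=3,
)
# Second segment: 20 two-temperature cycles (95 C hold assumed to be 30 s).
segment_2 = Segment(steps=[Step(95, 30), Step(65, 120)], cycles=20)
FINAL_HOLD_S = 5 * 60  # hold after the last cycle

total_s = (segment_1.programmed_seconds()
           + segment_2.programmed_seconds()
           + FINAL_HOLD_S)
print(f"programmed time: {total_s} s ({total_s / 60:.1f} min)")
```

Under these assumptions the programmed time is a little over one hour, excluding the time the instrument needs to move between temperatures.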

The intended 1.86 kb PCR product is purified by agarose gel electrophoresis, and recovered following phenol-chloroform extraction and ethanol precipitation. The purified product is digested with restriction endonucleases NdeI and BamHI and ligated with NdeI/BamHI-digested and dephosphorylated plasmid vector pDG164 (U.S. Ser. No. 455,967, filed Dec. 22, 1989, Example 6B which was filed in the PCT as PCT/US90/07639 and published on Jul. 11, 1991). Ampicillin-resistant transformants of E. coli strain DG116 are selected at 30° C. and screened for the desired recombinant plasmid. Plasmid pTAFI285 encodes a 609 amino acid, 5' to 3' exonuclease-deficient Thermosipho africanus thermostable DNA polymerase analogous to the pTMA15-encoded protein of Example 3. The DNA polymerase activity is purified as in Example 3. The purified protein is deficient in 5' to 3' exonuclease activity, is more thermoresistant than the corresponding native enzyme and is particularly useful in PCR of G+C-rich templates.
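The residue count and product size quoted above can be cross-checked with simple arithmetic. The sketch below assumes that one initiating methionine codon is added ahead of residue 285 and that a single stop codon is carried within the amplified fragment; the function name is hypothetical.

```python
def truncated_orf(first_res, last_res, add_met=True, add_stop=True):
    """Return (native residues kept, residues encoded, coding nucleotides)."""
    native = last_res - first_res + 1
    encoded = native + (1 if add_met else 0)
    coding_nt = 3 * encoded + (3 if add_stop else 0)
    return native, encoded, coding_nt


native, encoded, coding_nt = truncated_orf(285, 892)
reported_nt = 1860  # "1.86 kb", a rounded figure
print(native, encoded, coding_nt, reported_nt - coding_nt)
```

On these assumptions amino acids 285 through 892 are 608 native residues, giving a 609 amino acid product and 1830 coding nucleotides. The difference of roughly 30 nucleotides from the reported 1.86 kb is of the size expected for the non-coding primer tails, allowing for rounding.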

    __________________________________________________________________________
    Primer   SEQ ID NO:      SEQUENCE
    __________________________________________________________________________
    TAFI285  SEQ ID NO: 37   GTCGGCATATGATTAAAGAACTTAATTTACAAGAAAAATTAGAAAAGG
    TAFR01   SEQ ID NO: 38   CCTTTACCCCAGGATCCTCATTCCCACTCTTTTCCATAATAAACAT
    __________________________________________________________________________
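As an illustrative check of the primer design, the forward primer can be translated from its ATG and the reverse primer can be reverse-complemented to show the stop codon next to the BamHI site. The sketch uses the standard genetic code and the sequences from the table above; the function names are hypothetical, and reading the TCA in TAFR01 as the complement of a TGA stop codon is an interpretation, not a statement from the text.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

TAFI285 = "GTCGGCATATGATTAAAGAACTTAATTTACAAGAAAAATTAGAAAAGG"
TAFR01 = "CCTTTACCCCAGGATCCTCATTCCCACTCTTTTCCATAATAAACAT"


def translate_from_start(seq):
    """Translate complete codons beginning at the first ATG in seq."""
    start = seq.index("ATG")
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


print(translate_from_start(TAFI285))   # MIKELNLQEKLEK
print(reverse_complement(TAFR01))
```

The forward primer thus supplies a methionine codon, within the NdeI site, directly ahead of the first native codons, and the sense strand copied from the reverse primer reads TGA immediately before GGATCC.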

The foregoing written specification is considered to be sufficient to enable one skilled in the art to practice the invention. The present invention is not to be limited in scope by the cell lines deposited, since the deposited embodiment is intended as a single illustration of one aspect of the invention and any cell lines that are functionally equivalent are within the scope of this invention. The deposit of materials therein does not constitute an admission that the written description herein contained is inadequate to enable the practice of any aspect of the invention, including the best mode thereof, nor are the deposits to be construed as limiting the scope of the claims to the specific illustrations that they represent. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and fall within the scope of the appended claims.

    __________________________________________________________________________
    SEQUENCE LISTING
    (1) GENERAL INFORMATION:
    (iii) NUMBER OF SEQUENCES: 38
    (2) INFORMATION FOR SEQ ID NO:1:
    (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 2499 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA (genomic)
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Thermus aquaticus
    (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..2496
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
    ATGAGGGGGATGCTGCCCCTCTTTGAGCCCAAGGGCCGGGTCCTCCTG 48
    MetArgGlyMetLeuProLeuPheGluProLysGlyArgValLeuLeu
    1 5 10 15
    GTGGACGGCCACCACCTGGCCTACCGCACCTTCCACGCCCTGAAGGGC 96
    ValAspGlyHisHisLeuAlaTyrArgThrPheHisAlaLeuLysGly
    20 25 30
CTCACCACCAGCCGGGGGGAGCCGGTGCAGGCGGTCTACGGCTTCGCC144                           LeuThr ThrSerArgGlyGluProValGlnAlaValTyrGlyPheAla                             354045                                                                        AAGAGCCTCCTCAAGGCCCTCAAGGAGGACGGGGACGCGGTGATCGTG192                           LysSerLeu LeuLysAlaLeuLysGluAspGlyAspAlaValIleVal                             505560                                                                        GTCTTTGACGCCAAGGCCCCCTCCTTCCGCCACGAGGCCTACGGGGGG240                           ValPheAspAlaLys AlaProSerPheArgHisGluAlaTyrGlyGly                             65707580                                                                      TACAAGGCGGGCCGGGCCCCCACGCCGGAGGACTTTCCCCGGCAACTC288                           TyrLysAla GlyArgAlaProThrProGluAspPheProArgGlnLeu                             859095                                                                        GCCCTCATCAAGGAGCTGGTGGACCTCCTGGGGCTGGCGCGCCTCGAG336                           AlaLeu IleLysGluLeuValAspLeuLeuGlyLeuAlaArgLeuGlu                             100105110                                                                     GTCCCGGGCTACGAGGCGGACGACGTCCTGGCCAGCCTGGCCAAGAAG384                           ValPro GlyTyrGluAlaAspAspValLeuAlaSerLeuAlaLysLys                             115120125                                                                     GCGGAAAAGGAGGGCTACGAGGTCCGCATCCTCACCGCCGACAAAGAC432                           AlaGluLys GluGlyTyrGluValArgIleLeuThrAlaAspLysAsp                             130135140                                                                     CTTTACCAGCTCCTTTCCGACCGCATCCACGTCCTCCACCCCGAGGGG480                           LeuTyrGlnLeuLeu SerAspArgIleHisValLeuHisProGluGly                             145150155160                                                                  TACCTCATCACCCCGGCCTGGCTTTGGGAAAAGTACGGCCTGAGGCCC528                           TyrLeuIle ThrProAlaTrpLeuTrpGluLysTyrGlyLeuArgPro 
                            165170175                                                                     GACCAGTGGGCCGACTACCGGGCCCTGACCGGGGACGAGTCCGACAAC576                           AspGln TrpAlaAspTyrArgAlaLeuThrGlyAspGluSerAspAsn                             180185190                                                                     CTTCCCGGGGTCAAGGGCATCGGGGAGAAGACGGCGAGGAAGCTTCTG624                           LeuPro GlyValLysGlyIleGlyGluLysThrAlaArgLysLeuLeu                             195200205                                                                     GAGGAGTGGGGGAGCCTGGAAGCCCTCCTCAAGAACCTGGACCGGCTG672                           GluGluTrp GlySerLeuGluAlaLeuLeuLysAsnLeuAspArgLeu                             210215220                                                                     AAGCCCGCCATCCGGGAGAAGATCCTGGCCCACATGGACGATCTGAAG720                           LysProAlaIleArg GluLysIleLeuAlaHisMetAspAspLeuLys                             225230235240                                                                  CTCTCCTGGGACCTGGCCAAGGTGCGCACCGACCTGCCCCTGGAGGTG768                           LeuSerTrp AspLeuAlaLysValArgThrAspLeuProLeuGluVal                             245250255                                                                     GACTTCGCCAAAAGGCGGGAGCCCGACCGGGAGAGGCTTAGGGCCTTT816                           AspPhe AlaLysArgArgGluProAspArgGluArgLeuArgAlaPhe                             260265270                                                                     CTGGAGAGGCTTGAGTTTGGCAGCCTCCTCCACGAGTTCGGCCTTCTG864                           LeuGlu ArgLeuGluPheGlySerLeuLeuHisGluPheGlyLeuLeu                             275280285                                                                     GAAAGCCCCAAGGCCCTGGAGGAGGCCCCCTGGCCCCCGCCGGAAGGG912                           GluSerPro LysAlaLeuGluGluAlaProTrpProProProGluGly                             290295300                                                                     
GCCTTCGTGGGCTTTGTGCTTTCCCGCAAGGAGCCCATGTGGGCCGAT960                           AlaPheValGlyPhe ValLeuSerArgLysGluProMetTrpAlaAsp                             305310315320                                                                  CTTCTGGCCCTGGCCGCCGCCAGGGGGGGCCGGGTCCACCGGGCCCCC1008                          LeuLeuAla LeuAlaAlaAlaArgGlyGlyArgValHisArgAlaPro                             325330335                                                                     GAGCCTTATAAAGCCCTCAGGGACCTGAAGGAGGCGCGGGGGCTTCTC1056                          GluPro TyrLysAlaLeuArgAspLeuLysGluAlaArgGlyLeuLeu                             340345350                                                                     GCCAAAGACCTGAGCGTTCTGGCCCTGAGGGAAGGCCTTGGCCTCCCG1104                          AlaLys AspLeuSerValLeuAlaLeuArgGluGlyLeuGlyLeuPro                             355360365                                                                     CCCGGCGACGACCCCATGCTCCTCGCCTACCTCCTGGACCCTTCCAAC1152                          ProGlyAsp AspProMetLeuLeuAlaTyrLeuLeuAspProSerAsn                             370375380                                                                     ACCACCCCCGAGGGGGTGGCCCGGCGCTACGGCGGGGAGTGGACGGAG1200                          ThrThrProGluGly ValAlaArgArgTyrGlyGlyGluTrpThrGlu                             385390395400                                                                  GAGGCGGGGGAGCGGGCCGCCCTTTCCGAGAGGCTCTTCGCCAACCTG1248                          GluAlaGly GluArgAlaAlaLeuSerGluArgLeuPheAlaAsnLeu                             405410415                                                                     TGGGGGAGGCTTGAGGGGGAGGAGAGGCTCCTTTGGCTTTACCGGGAG1296                          TrpGly ArgLeuGluGlyGluGluArgLeuLeuTrpLeuTyrArgGlu                             420425430                                                                     GTGGAGAGGCCCCTTTCCGCTGTCCTGGCCCACATGGAGGCCACGGGG1344                          ValGlu ArgProLeuSerAlaValLeuAlaHisMetGluAlaThrGly 
                            435440445                                                                     GTGCGCCTGGACGTGGCCTATCTCAGGGCCTTGTCCCTGGAGGTGGCC1392                          ValArgLeu AspValAlaTyrLeuArgAlaLeuSerLeuGluValAla                             450455460                                                                     GAGGAGATCGCCCGCCTCGAGGCCGAGGTCTTCCGCCTGGCCGGCCAC1440                          GluGluIleAlaArg LeuGluAlaGluValPheArgLeuAlaGlyHis                             465470475480                                                                  CCCTTCAACCTCAACTCCCGGGACCAGCTGGAAAGGGTCCTCTTTGAC1488                          ProPheAsn LeuAsnSerArgAspGlnLeuGluArgValLeuPheAsp                             485490495                                                                     GAGCTAGGGCTTCCCGCCATCGGCAAGACGGAGAAGACCGGCAAGCGC1536                          GluLeu GlyLeuProAlaIleGlyLysThrGluLysThrGlyLysArg                             500505510                                                                     TCCACCAGCGCCGCCGTCCTGGAGGCCCTCCGCGAGGCCCACCCCATC1584                          SerThr SerAlaAlaValLeuGluAlaLeuArgGluAlaHisProIle                             515520525                                                                     GTGGAGAAGATCCTGCAGTACCGGGAGCTCACCAAGCTGAAGAGCACC1632                          ValGluLys IleLeuGlnTyrArgGluLeuThrLysLeuLysSerThr                             530535540                                                                     TACATTGACCCCTTGCCGGACCTCATCCACCCCAGGACGGGCCGCCTC1680                          TyrIleAspProLeu ProAspLeuIleHisProArgThrGlyArgLeu                             545550555560                                                                  CACACCCGCTTCAACCAGACGGCCACGGCCACGGGCAGGCTAAGTAGC1728                          HisThrArg PheAsnGlnThrAlaThrAlaThrGlyArgLeuSerSer                             565570575                                                                     
TCCGATCCCAACCTCCAGAACATCCCCGTCCGCACCCCGCTTGGGCAG1776                          SerAsp ProAsnLeuGlnAsnIleProValArgThrProLeuGlyGln                             580585590                                                                     AGGATCCGCCGGGCCTTCATCGCCGAGGAGGGGTGGCTATTGGTGGCC1824                          ArgIle ArgArgAlaPheIleAlaGluGluGlyTrpLeuLeuValAla                             595600605                                                                     CTGGACTATAGCCAGATAGAGCTCAGGGTGCTGGCCCACCTCTCCGGC1872                          LeuAspTyr SerGlnIleGluLeuArgValLeuAlaHisLeuSerGly                             610615620                                                                     GACGAGAACCTGATCCGGGTCTTCCAGGAGGGGCGGGACATCCACACG1920                          AspGluAsnLeuIle ArgValPheGlnGluGlyArgAspIleHisThr                             625630635640                                                                  GAGACCGCCAGCTGGATGTTCGGCGTCCCCCGGGAGGCCGTGGACCCC1968                          GluThrAla SerTrpMetPheGlyValProArgGluAlaValAspPro                             645650655                                                                     CTGATGCGCCGGGCGGCCAAGACCATCAACTTCGGGGTCCTCTACGGC2016                          LeuMet ArgArgAlaAlaLysThrIleAsnPheGlyValLeuTyrGly                             660665670                                                                     ATGTCGGCCCACCGCCTCTCCCAGGAGCTAGCCATCCCTTACGAGGAG2064                          MetSer AlaHisArgLeuSerGlnGluLeuAlaIleProTyrGluGlu                             675680685                                                                     GCCCAGGCCTTCATTGAGCGCTACTTTCAGAGCTTCCCCAAGGTGCGG2112                          AlaGlnAla PheIleGluArgTyrPheGlnSerPheProLysValArg                             690695700                                                                     GCCTGGATTGAGAAGACCCTGGAGGAGGGCAGGAGGCGGGGGTACGTG2160                          AlaTrpIleGluLys ThrLeuGluGluGlyArgArgArgGlyTyrVal 
                            705710715720                                                                  GAGACCCTCTTCGGCCGCCGCCGCTACGTGCCAGACCTAGAGGCCCGG2208                          GluThrLeu PheGlyArgArgArgTyrValProAspLeuGluAlaArg                             725730735                                                                     GTGAAGAGCGTGCGGGAGGCGGCCGAGCGCATGGCCTTCAACATGCCC2256                          ValLys SerValArgGluAlaAlaGluArgMetAlaPheAsnMetPro                             740745750                                                                     GTCCAGGGCACCGCCGCCGACCTCATGAAGCTGGCTATGGTGAAGCTC2304                          ValGln GlyThrAlaAlaAspLeuMetLysLeuAlaMetValLysLeu                             755760765                                                                     TTCCCCAGGCTGGAGGAAATGGGGGCCAGGATGCTCCTTCAGGTCCAC2352                          PheProArg LeuGluGluMetGlyAlaArgMetLeuLeuGlnValHis                             770775780                                                                     GACGAGCTGGTCCTCGAGGCCCCAAAAGAGAGGGCGGAGGCCGTGGCC2400                          AspGluLeuValLeu GluAlaProLysGluArgAlaGluAlaValAla                             785790795800                                                                  CGGCTGGCCAAGGAGGTCATGGAGGGGGTGTATCCCCTGGCCGTGCCC2448                          ArgLeuAla LysGluValMetGluGlyValTyrProLeuAlaValPro                             805810815                                                                     CTGGAGGTGGAGGTGGGGATAGGGGAGGACTGGCTCTCCGCCAAGGAG2496                          LeuGlu ValGluValGlyIleGlyGluAspTrpLeuSerAlaLysGlu                             820825830                                                                     TGA2499                                                                       (2) INFORMATION FOR SEQ ID NO:2:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 832 amino 
acids                                                   (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                       MetArgGlyMetLeuProLeuPheGluProLysGlyArgValLeuLeu                              15 1015                                                                       ValAspGlyHisHisLeuAlaTyrArgThrPheHisAlaLeuLysGly                              202530                                                                        LeuThrThrSerArgGlyGluProVal GlnAlaValTyrGlyPheAla                             354045                                                                        LysSerLeuLeuLysAlaLeuLysGluAspGlyAspAlaValIleVal                              505560                                                                        ValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlyGly                              65707580                                                                      TyrLysAlaGlyArgAlaProThrProGluAspPheProArgGlnLeu                               859095                                                                       AlaLeuIleLysGluLeuValAspLeuLeuGlyLeuAlaArgLeuGlu                              100105110                                                                     ValProGly TyrGluAlaAspAspValLeuAlaSerLeuAlaLysLys                             115120125                                                                     AlaGluLysGluGlyTyrGluValArgIleLeuThrAlaAspLysAsp                              130 135140                                                                    LeuTyrGlnLeuLeuSerAspArgIleHisValLeuHisProGluGly                              145150155160                                                                  TyrLeuIleThrProAlaTrpLeuTrpGlu LysTyrGlyLeuArgPro                       
      165170175                                                                     AspGlnTrpAlaAspTyrArgAlaLeuThrGlyAspGluSerAspAsn                              180185 190                                                                    LeuProGlyValLysGlyIleGlyGluLysThrAlaArgLysLeuLeu                              195200205                                                                     GluGluTrpGlySerLeuGluAlaLeuLeuLysAsnLeuAspArgLeu                               210215220                                                                    LysProAlaIleArgGluLysIleLeuAlaHisMetAspAspLeuLys                              225230235240                                                                  LeuSerTrpAsp LeuAlaLysValArgThrAspLeuProLeuGluVal                             245250255                                                                     AspPheAlaLysArgArgGluProAspArgGluArgLeuArgAlaPhe                              260 265270                                                                    LeuGluArgLeuGluPheGlySerLeuLeuHisGluPheGlyLeuLeu                              275280285                                                                     GluSerProLysAlaLeuGluGluAlaPro TrpProProProGluGly                             290295300                                                                     AlaPheValGlyPheValLeuSerArgLysGluProMetTrpAlaAsp                              305310315 320                                                                 LeuLeuAlaLeuAlaAlaAlaArgGlyGlyArgValHisArgAlaPro                              325330335                                                                     GluProTyrLysAlaLeuArgAspLeuLysGluAlaArgGlyLeuL eu                             340345350                                                                     AlaLysAspLeuSerValLeuAlaLeuArgGluGlyLeuGlyLeuPro                              355360365                                                                     ProGlyAspAsp 
ProMetLeuLeuAlaTyrLeuLeuAspProSerAsn                             370375380                                                                     ThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrpThrGlu                              385390 395400                                                                 GluAlaGlyGluArgAlaAlaLeuSerGluArgLeuPheAlaAsnLeu                              405410415                                                                     TrpGlyArgLeuGluGlyGluGluArg LeuLeuTrpLeuTyrArgGlu                             420425430                                                                     ValGluArgProLeuSerAlaValLeuAlaHisMetGluAlaThrGly                              435440 445                                                                    ValArgLeuAspValAlaTyrLeuArgAlaLeuSerLeuGluValAla                              450455460                                                                     GluGluIleAlaArgLeuGluAlaGluValPheArgLeuAlaGlyHis                              465 470475480                                                                 ProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeuPheAsp                              485490495                                                                     GluLeuGly LeuProAlaIleGlyLysThrGluLysThrGlyLysArg                             500505510                                                                     SerThrSerAlaAlaValLeuGluAlaLeuArgGluAlaHisProIle                              515 520525                                                                    ValGluLysIleLeuGlnTyrArgGluLeuThrLysLeuLysSerThr                              530535540                                                                     TyrIleAspProLeuProAspLeuIleHisProArg ThrGlyArgLeu                             545550555560                                                                  HisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeuSerSer                              565570 575                                                     
               SerAspProAsnLeuGlnAsnIleProValArgThrProLeuGlyGln                              580585590                                                                     ArgIleArgArgAlaPheIleAlaGluGluGlyTrpLeuLeuValA la                             595600605                                                                     LeuAspTyrSerGlnIleGluLeuArgValLeuAlaHisLeuSerGly                              610615620                                                                     AspGluAsnLeuIleArg ValPheGlnGluGlyArgAspIleHisThr                             625630635640                                                                  GluThrAlaSerTrpMetPheGlyValProArgGluAlaValAspPro                              645 650655                                                                    LeuMetArgArgAlaAlaLysThrIleAsnPheGlyValLeuTyrGly                              660665670                                                                     MetSerAlaHisArgLeuSerGlnGlu LeuAlaIleProTyrGluGlu                             675680685                                                                     AlaGlnAlaPheIleGluArgTyrPheGlnSerPheProLysValArg                              690695700                                                                     AlaTrpIleGluLysThrLeuGluGluGlyArgArgArgGlyTyrVal                              705710715720                                                                  GluThrLeuPheGlyArgArgArgTyrValProAspLeuGluAlaArg                               725730735                                                                    ValLysSerValArgGluAlaAlaGluArgMetAlaPheAsnMetPro                              740745750                                                                     ValGlnGly ThrAlaAlaAspLeuMetLysLeuAlaMetValLysLeu                             755760765                                                                     PheProArgLeuGluGluMetGlyAlaArgMetLeuLeuGlnValHis                              770 775780                         
                                           AspGluLeuValLeuGluAlaProLysGluArgAlaGluAlaValAla                              785790795800                                                                  ArgLeuAlaLysGluValMetGluGlyVal TyrProLeuAlaValPro                             805810815                                                                     LeuGluValGluValGlyIleGlyGluAspTrpLeuSerAlaLysGlu                              820825 830                                                                    (2) INFORMATION FOR SEQ ID NO:3:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 2682 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (iii) HYPOTHETICAL: NO                                                        (iv) ANTI-SENSE: NO                                                           (vi) ORIGINAL SOURCE:                                                         (A) ORGANISM: Thermotoga maritima                                             (ix ) FEATURE:                                                                (A) NAME/KEY: CDS                                                             (B) LOCATION: 1..2679                                                         (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:                                       ATGGCGAGACTATTTCTCTTTGATGGAACTGCTCTGGCCTACAGAGCG48                            MetAlaArgLeuPheLeuPheAspGlyThrAlaLeuAlaTyrArgAla                               151015                                                                       TACTATGCGCTCGATAGATCGCTTTCTACTTCCACCGGCATTCCCACA96                            
TyrTyrAlaLeuAspArgSerLeuSerThrSerThrGlyIleProTh r                             202530                                                                        AACGCCACATACGGTGTGGCGAGGATGCTGGTGAGATTCATCAAAGAC144                           AsnAlaThrTyrGlyValAlaArgMetLeuValArgPheIleLysAs p                             354045                                                                        CATATCATTGTCGGAAAAGACTACGTTGCTGTGGCTTTCGACAAAAAA192                           HisIleIleValGlyLysAspTyrValAlaValAlaPheAspLysLys                               505560                                                                       GCTGCCACCTTCAGACACAAGCTCCTCGAGACTTACAAGGCTCAAAGA240                           AlaAlaThrPheArgHisLysLeuLeuGluThrTyrLysAlaGlnArg                              65 707580                                                                     CCAAAGACTCCGGATCTCCTGATTCAGCAGCTTCCGTACATAAAGAAG288                           ProLysThrProAspLeuLeuIleGlnGlnLeuProTyrIleLysLys                               859095                                                                       CTGGTCGAAGCCCTTGGAATGAAAGTGCTGGAGGTAGAAGGATACGAA336                           LeuValGluAlaLeuGlyMetLysValLeuGluValGluGlyTyrGl u                             100105110                                                                     GCGGACGATATAATTGCCACTCTGGCTGTGAAGGGGCTTCCGCTTTTT384                           AlaAspAspIleIleAlaThrLeuAlaValLysGlyLeuProLeuPh e                             115120125                                                                     GATGAAATATTCATAGTGACCGGAGATAAAGACATGCTTCAGCTTGTG432                           AspGluIlePheIleValThrGlyAspLysAspMetLeuGlnLeuVal                               130135140                                                                    AACGAAAAGATCAAGGTGTGGCGAATCGTAAAAGGGATATCCGATCTG480                           AsnGluLysIleLysValTrpArgIleValLysGlyIleSerAspLeu                              145 150155160                                     
                            GAACTTTACGATGCGCAGAAGGTGAAGGAAAAATACGGTGTTGAACCC528                           GluLeuTyrAspAlaGlnLysValLysGluLysTyrGlyValGluPro                               165170175                                                                    CAGCAGATCCCGGATCTTCTGGCTCTAACCGGAGATGAAATAGACAAC576                           GlnGlnIleProAspLeuLeuAlaLeuThrGlyAspGluIleAspAs n                             180185190                                                                     ATCCCCGGTGTAACTGGGATAGGTGAAAAGACTGCTGTTCAGCTTCTA624                           IleProGlyValThrGlyIleGlyGluLysThrAlaValGlnLeuLe u                             195200205                                                                     GAGAAGTACAAAGACCTCGAAGACATACTGAATCATGTTCGCGAACTT672                           GluLysTyrLysAspLeuGluAspIleLeuAsnHisValArgGluLeu                               210215220                                                                    CCTCAAAAGGTGAGAAAAGCCCTGCTTCGAGACAGAGAAAACGCCATT720                           ProGlnLysValArgLysAlaLeuLeuArgAspArgGluAsnAlaIle                              225 230235240                                                                 CTCAGCAAAAAGCTGGCGATTCTGGAAACAAACGTTCCCATTGAAATA768                           LeuSerLysLysLeuAlaIleLeuGluThrAsnValProIleGluIle                               245250255                                                                    AACTGGGAAGAACTTCGCTACCAGGGCTACGACAGAGAGAAACTCTTA816                           AsnTrpGluGluLeuArgTyrGlnGlyTyrAspArgGluLysLeuLe u                             260265270                                                                     CCACTTTTGAAAGAACTGGAATTCGCATCCATCATGAAGGAACTTCAA864                           ProLeuLeuLysGluLeuGluPheAlaSerIleMetLysGluLeuGl n                             275280285                                                                     CTGTACGAAGAGTCCGAACCCGTTGGATACAGAATAGTGAAAGACCTA912                           
LeuTyrGluGluSerGluProValGlyTyrArgIleValLysAspLeu                               290295300                                                                    GTGGAATTTGAAAAACTCATAGAGAAACTGAGAGAATCCCCTTCGTTC960                           ValGluPheGluLysLeuIleGluLysLeuArgGluSerProSerPhe                              305 310315320                                                                 GCCATAGATCTTGAGACGTCTTCCCTCGATCCTTTCGACTGCGACATT1008                          AlaIleAspLeuGluThrSerSerLeuAspProPheAspCysAspIle                               325330335                                                                    GTCGGTATCTCTGTGTCTTTCAAACCAAAGGAAGCGTACTACATACCA1056                          ValGlyIleSerValSerPheLysProLysGluAlaTyrTyrIlePr o                             340345350                                                                     CTCCATCATAGAAACGCCCAGAACCTGGACGAAAAAGAGGTTCTGAAA1104                          LeuHisHisArgAsnAlaGlnAsnLeuAspGluLysGluValLeuLy s                             355360365                                                                     AAGCTCAAAGAAATTCTGGAGGACCCCGGAGCAAAGATCGTTGGTCAG1152                          LysLeuLysGluIleLeuGluAspProGlyAlaLysIleValGlyGln                               370375380                                                                    AATTTGAAATTCGATTACAAGGTGTTGATGGTGAAGGGTGTTGAACCT1200                          AsnLeuLysPheAspTyrLysValLeuMetValLysGlyValGluPro                              385 390395400                                                                 GTTCCTCCTTACTTCGACACGATGATAGCGGCTTACCTTCTTGAGCCG1248                          ValProProTyrPheAspThrMetIleAlaAlaTyrLeuLeuGluPro                               405410415                                                                    AACGAAAAGAAGTTCAATCTGGACGATCTCGCATTGAAATTTCTTGGA1296                          AsnGluLysLysPheAsnLeuAspAspLeuAlaLeuLysPheLeuGl y                             420425430                                         
                            TACAAAATGACATCTTACCAAGAGCTCATGTCCTTCTCTTTTCCGCTG1344                          TyrLysMetThrSerTyrGlnGluLeuMetSerPheSerPheProLe u                             435440445                                                                     TTTGGTTTCAGTTTTGCCGATGTTCCTGTAGAAAAAGCAGCGAACTAC1392                          PheGlyPheSerPheAlaAspValProValGluLysAlaAlaAsnTyr                               450455460                                                                    TCCTGTGAAGATGCAGACATCACCTACAGACTTTACAAGACCCTGAGC1440                          SerCysGluAspAlaAspIleThrTyrArgLeuTyrLysThrLeuSer                              465 470475480                                                                 TTAAAACTCCACGAGGCAGATCTGGAAAACGTGTTCTACAAGATAGAA1488                          LeuLysLeuHisGluAlaAspLeuGluAsnValPheTyrLysIleGlu                               485490495                                                                    ATGCCCCTTGTGAACGTGCTTGCACGGATGGAACTGAACGGTGTGTAT1536                          MetProLeuValAsnValLeuAlaArgMetGluLeuAsnGlyValTy r                             500505510                                                                     GTGGACACAGAGTTCCTGAAGAAACTCTCAGAAGAGTACGGAAAAAAA1584                          ValAspThrGluPheLeuLysLysLeuSerGluGluTyrGlyLysLy s                             515520525                                                                     CTCGAAGAACTGGCAGAGGAAATATACAGGATAGCTGGAGAGCCGTTC1632                          LeuGluGluLeuAlaGluGluIleTyrArgIleAlaGlyGluProPhe                               530535540                                                                    AACATAAACTCACCGAAGCAGGTTTCAAGGATCCTTTTTGAAAAACTC1680                          AsnIleAsnSerProLysGlnValSerArgIleLeuPheGluLysLeu                              545 550555560                                                                 GGCATAAAACCACGTGGTAAAACGACGAAAACGGGAGACTATTCAACA1728                          
GlyIleLysProArgGlyLysThrThrLysThrGlyAspTyrSerThr                               565570575                                                                    CGCATAGAAGTCCTCGAGGAACTTGCCGGTGAACACGAAATCATTCCT1776                          ArgIleGluValLeuGluGluLeuAlaGlyGluHisGluIleIlePr o                             580585590                                                                     CTGATTCTTGAATACAGAAAGATACAGAAATTGAAATCAACCTACATA1824                          LeuIleLeuGluTyrArgLysIleGlnLysLeuLysSerThrTyrIl e                             595600605                                                                     GACGCTCTTCCCAAGATGGTCAACCCAAAGACCGGAAGGATTCATGCT1872                          AspAlaLeuProLysMetValAsnProLysThrGlyArgIleHisAla                               610615620                                                                    TCTTTCAATCAAACGGGGACTGCCACTGGAAGACTTAGCAGCAGCGAT1920                          SerPheAsnGlnThrGlyThrAlaThrGlyArgLeuSerSerSerAsp                              625 630635640                                                                 CCCAATCTTCAGAACCTCCCGACGAAAAGTGAAGAGGGAAAAGAAATC1968                          ProAsnLeuGlnAsnLeuProThrLysSerGluGluGlyLysGluIle                               645650655                                                                    AGGAAAGCGATAGTTCCTCAGGATCCAAACTGGTGGATCGTCAGTGCC2016                          ArgLysAlaIleValProGlnAspProAsnTrpTrpIleValSerAl a                             660665670                                                                     GACTACTCCCAAATAGAACTGAGGATCCTCGCCCATCTCAGTGGTGAT2064                          AspTyrSerGlnIleGluLeuArgIleLeuAlaHisLeuSerGlyAs p                             675680685                                                                     GAGAATCTTTTGAGGGCATTCGAAGAGGGCATCGACGTCCACACTCTA2112                          GluAsnLeuLeuArgAlaPheGluGluGlyIleAspValHisThrLeu                               690695700                                        
                            ACAGCTTCCAGAATATTCAACGTGAAACCCGAAGAAGTAACCGAAGAA2160                          ThrAlaSerArgIlePheAsnValLysProGluGluValThrGluGlu                              705 710715720                                                                 ATGCGCCGCGCTGGTAAAATGGTTAATTTTTCCATCATATACGGTGTA2208                          MetArgArgAlaGlyLysMetValAsnPheSerIleIleTyrGlyVal                               725730735                                                                    ACACCTTACGGTCTGTCTGTGAGGCTTGGAGTACCTGTGAAAGAAGCA2256                          ThrProTyrGlyLeuSerValArgLeuGlyValProValLysGluAl a                             740745750                                                                     GAAAAGATGATCGTCAACTACTTCGTCCTCTACCCAAAGGTGCGCGAT2304                          GluLysMetIleValAsnTyrPheValLeuTyrProLysValArgAs p                             755760765                                                                     TACATTCAGAGGGTCGTATCGGAAGCGAAAGAAAAAGGCTATGTTAGA2352                          TyrIleGlnArgValValSerGluAlaLysGluLysGlyTyrValArg                               770775780                                                                    ACGCTGTTTGGAAGAAAAAGAGACATACCACAGCTCATGGCCCGGGAC2400                          ThrLeuPheGlyArgLysArgAspIleProGlnLeuMetAlaArgAsp                              785 790795800                                                                 AGGAACACACAGGCTGAAGGAGAACGAATTGCCATAAACACTCCCATA2448                          ArgAsnThrGlnAlaGluGlyGluArgIleAlaIleAsnThrProIle                               805810815                                                                    CAGGGTACAGCAGCGGATATAATAAAGCTGGCTATGATAGAAATAGAC2496                          GlnGlyThrAlaAlaAspIleIleLysLeuAlaMetIleGluIleAs p                             820825830                                                                     AGGGAACTGAAAGAAAGAAAAATGAGATCGAAGATGATCATACAGGTC2544                          
ArgGluLeuLysGluArgLysMetArgSerLysMetIleIleGlnVal
835 840 845
CACGACGAACTGGTTTTTGAAGTGCCCAATGAGGAAAAGGACGCGCTC 2592
HisAspGluLeuValPheGluValProAsnGluGluLysAspAlaLeu
850 855 860
GTCGAGCTGGTGAAAGACAGAATGACGAATGTGGTAAAGCTTTCAGTG 2640
ValGluLeuValLysAspArgMetThrAsnValValLysLeuSerVal
865 870 875 880
CCGCTCGAAGTGGATGTAACCATCGGCAAAACATGGTCGTGA 2682
ProLeuGluValAspValThrIleGlyLysThrTrpSer
885 890

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
  (A) LENGTH: 893 amino acids
  (B) TYPE: amino acid
  (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
MetAlaArgLeuPheLeuPheAspGlyThrAlaLeuAlaTyrArgAla
1 5 10 15
TyrTyrAlaLeuAspArgSerLeuSerThrSerThrGlyIleProThr
20 25 30
AsnAlaThrTyrGlyValAlaArgMetLeuValArgPheIleLysAsp
35 40 45
HisIleIleValGlyLysAspTyrValAlaValAlaPheAspLysLys
50 55 60
                            AlaAlaThrPheArgHisLysLeuLeuGluThrTyrLysAlaGlnArg                              65707580                                                                      ProLysThrProAspLeuLeu IleGlnGlnLeuProTyrIleLysLys                             859095                                                                        LeuValGluAlaLeuGlyMetLysValLeuGluValGluGlyTyrGlu                              1001 05110                                                                    AlaAspAspIleIleAlaThrLeuAlaValLysGlyLeuProLeuPhe                              115120125                                                                     AspGluIlePheIleValThrGlyAspLysAspMetLeuG lnLeuVal                             130135140                                                                     AsnGluLysIleLysValTrpArgIleValLysGlyIleSerAspLeu                              145150155160                                                                  Glu LeuTyrAspAlaGlnLysValLysGluLysTyrGlyValGluPro                             165170175                                                                     GlnGlnIleProAspLeuLeuAlaLeuThrGlyAspGluIleAspAsn                               180185190                                                                    IleProGlyValThrGlyIleGlyGluLysThrAlaValGlnLeuLeu                              195200205                                                                     GluLysTyrLysAspLeuGlu AspIleLeuAsnHisValArgGluLeu                             210215220                                                                     ProGlnLysValArgLysAlaLeuLeuArgAspArgGluAsnAlaIle                              225230235 240                                                                 LeuSerLysLysLeuAlaIleLeuGluThrAsnValProIleGluIle                              245250255                                                                     AsnTrpGluGluLeuArgTyrGlnGlyTyrAspArgG luLysLeuLeu                             260265270             
                                                        ProLeuLeuLysGluLeuGluPheAlaSerIleMetLysGluLeuGln                              275280285                                                                     Leu TyrGluGluSerGluProValGlyTyrArgIleValLysAspLeu                             290295300                                                                     ValGluPheGluLysLeuIleGluLysLeuArgGluSerProSerPhe                              305 310315320                                                                 AlaIleAspLeuGluThrSerSerLeuAspProPheAspCysAspIle                              325330335                                                                     ValGlyIleSerValSer PheLysProLysGluAlaTyrTyrIlePro                             340345350                                                                     LeuHisHisArgAsnAlaGlnAsnLeuAspGluLysGluValLeuLys                              355360 365                                                                    LysLeuLysGluIleLeuGluAspProGlyAlaLysIleValGlyGln                              370375380                                                                     AsnLeuLysPheAspTyrLysValLeuMetValLysGlyValGluP ro                             385390395400                                                                  ValProProTyrPheAspThrMetIleAlaAlaTyrLeuLeuGluPro                              405410415                                                                      AsnGluLysLysPheAsnLeuAspAspLeuAlaLeuLysPheLeuGly                             420425430                                                                     TyrLysMetThrSerTyrGlnGluLeuMetSerPheSerPheProLeu                               435440445                                                                    PheGlyPheSerPheAlaAspValProValGluLysAlaAlaAsnTyr                              450455460                                                                     SerCysGluAspAlaAspIleThrTyr ArgLeuTyrLysThrLeuSer                       
      465470475480                                                                  LeuLysLeuHisGluAlaAspLeuGluAsnValPheTyrLysIleGlu                              4854 90495                                                                    MetProLeuValAsnValLeuAlaArgMetGluLeuAsnGlyValTyr                              500505510                                                                     ValAspThrGluPheLeuLysLysLeuSerGluGluT yrGlyLysLys                             515520525                                                                     LeuGluGluLeuAlaGluGluIleTyrArgIleAlaGlyGluProPhe                              530535540                                                                     AsnIleAsn SerProLysGlnValSerArgIleLeuPheGluLysLeu                             545550555560                                                                  GlyIleLysProArgGlyLysThrThrLysThrGlyAspTyrSerThr                               565570575                                                                    ArgIleGluValLeuGluGluLeuAlaGlyGluHisGluIleIlePro                              580585590                                                                     LeuIleLeuGluTyrArg LysIleGlnLysLeuLysSerThrTyrIle                             595600605                                                                     AspAlaLeuProLysMetValAsnProLysThrGlyArgIleHisAla                              610615 620                                                                    SerPheAsnGlnThrGlyThrAlaThrGlyArgLeuSerSerSerAsp                              625630635640                                                                  ProAsnLeuGlnAsnLeuProThrLysSerGluGluGlyL ysGluIle                             645650655                                                                     ArgLysAlaIleValProGlnAspProAsnTrpTrpIleValSerAla                              660665670                                                                      
AspTyrSerGlnIleGluLeuArgIleLeuAlaHisLeuSerGlyAsp                             675680685                                                                     GluAsnLeuLeuArgAlaPheGluGluGlyIleAspValHisThrLeu                              690 695700                                                                    ThrAlaSerArgIlePheAsnValLysProGluGluValThrGluGlu                              705710715720                                                                  MetArgArgAlaGlyLysMet ValAsnPheSerIleIleTyrGlyVal                             725730735                                                                     ThrProTyrGlyLeuSerValArgLeuGlyValProValLysGluAla                              7407 45750                                                                    GluLysMetIleValAsnTyrPheValLeuTyrProLysValArgAsp                              755760765                                                                     TyrIleGlnArgValValSerGluAlaLysGluLysGlyT yrValArg                             770775780                                                                     ThrLeuPheGlyArgLysArgAspIleProGlnLeuMetAlaArgAsp                              785790795800                                                                  Arg AsnThrGlnAlaGluGlyGluArgIleAlaIleAsnThrProIle                             805810815                                                                     GlnGlyThrAlaAlaAspIleIleLysLeuAlaMetIleGluIleAsp                               820825830                                                                    ArgGluLeuLysGluArgLysMetArgSerLysMetIleIleGlnVal                              835840845                                                                     HisAspGluLeuValPheGlu ValProAsnGluGluLysAspAlaLeu                             850855860                                                                     ValGluLeuValLysAspArgMetThrAsnValValLysLeuSerVal                              865870875 880                                      
ProLeuGluValAspValThrIleGlyLysThrTrpSer
885 890

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
  (A) LENGTH: 2493 base pairs
  (B) TYPE: nucleic acid
  (C) STRANDEDNESS: single
  (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
  (A) ORGANISM: Thermus species sps17
(ix) FEATURE:
  (A) NAME/KEY: CDS
  (B) LOCATION: 1..2490
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ATGCTGCCCCTCTTTGAGCCCAAGGGCCGGGTCCTCCTGGTGGACGGC 48
MetLeuProLeuPheGluProLysGlyArgValLeuLeuValAspGly
1 5 10 15
CACCACCTGGCCTACCGCACCTTTTTCGCCCTCAAGGGCCTCACCACC 96
HisHisLeuAlaTyrArgThrPhePheAlaLeuLysGlyLeuThrThr
20 25 30
AGCCGGGGCGAGCCCGTGCAGGCGGTTTATGGCTTCGCCAAAAGCCTC 144
SerArgGlyGluProValGlnAlaValTyrGlyPheAlaLysSerLeu
35 40 45
                                                       CTCAAGGCCCTGAAGGAGGATGGGGAG GTGGCCATCGTGGTCTTTGAC192                          LeuLysAlaLeuLysGluAspGlyGluValAlaIleValValPheAsp                              505560                                                                        GCCAAGGCCCCCTCCTTCCGCCACGAGGCCTAC GAGGCCTACAAGGCG240                          AlaLysAlaProSerPheArgHisGluAlaTyrGluAlaTyrLysAla                              65707580                                                                      GGCCGGGCCCCCACCCCGGAGGACTTT CCCCGGCAGCTCGCCCTCATC288                          GlyArgAlaProThrProGluAspPheProArgGlnLeuAlaLeuIle                              859095                                                                        AAGGAGCTGGTGGACCTTTTGGGC CTCGTGCGCCTTGAGGTCCCGGGC336                          LysGluLeuValAspLeuLeuGlyLeuValArgLeuGluValProGly                              100105110                                                                     TTTGAGGCGGACGATGTCCTCGCC ACCCTGGCCAAGAAGGCAGAAAGG384                          PheGluAlaAspAspValLeuAlaThrLeuAlaLysLysAlaGluArg                              115120125                                                                     GAGGGGTACGAGGTGCGCATCCTGAGC GCGGACCGCGACCTCTACCAG432                          GluGlyTyrGluValArgIleLeuSerAlaAspArgAspLeuTyrGln                              130135140                                                                     CTCCTTTCCGACCGGATCCACCTCCTCCACCCC GAGGGGGAGGTCCTG480                          LeuLeuSerAspArgIleHisLeuLeuHisProGluGlyGluValLeu                              145150155160                                                                  ACCCCCGGGTGGCTCCAGGAGCGCTAC GGCCTCTCCCCGGAGAGGTGG528                          ThrProGlyTrpLeuGlnGluArgTyrGlyLeuSerProGluArgTrp                              165170175                                                                     GTGGAGTACCGGGCCCTGGTGGGG GACCCTTCGGACAACCTCCCCGGG576                     
     ValGluTyrArgAlaLeuValGlyAspProSerAspAsnLeuProGly                              180185190                                                                     GTGCCCGGCATCGGGGAGAAGACC GCCCTGAAGCTCCTGAAGGAGTGG624                          ValProGlyIleGlyGluLysThrAlaLeuLysLeuLeuLysGluTrp                              195200205                                                                     GGTAGCCTGGAAGCGATTCTAAAGAAC CTGGACCAGGTGAAGCCGGAA672                          GlySerLeuGluAlaIleLeuLysAsnLeuAspGlnValLysProGlu                              210215220                                                                     AGGGTGCGGGAGGCCATCCGGAATAACCTGGAT AAGCTCCAGATGTCC720                          ArgValArgGluAlaIleArgAsnAsnLeuAspLysLeuGlnMetSer                              225230235240                                                                  CTGGAGCTTTCCCGCCTCCGCACCGAC CTCCCCCTGGAGGTGGACTTC768                          LeuGluLeuSerArgLeuArgThrAspLeuProLeuGluValAspPhe                              245250255                                                                     GCCAAGAGGCGGGAGCCCGACTGG GAGGGGCTTAAGGCCTTTTTGGAG816                          AlaLysArgArgGluProAspTrpGluGlyLeuLysAlaPheLeuGlu                              260265270                                                                     CGGCTTGAGTTCGGAAGCCTCCTC CACGAGTTCGGCCTTCTGGAGGCC864                          ArgLeuGluPheGlySerLeuLeuHisGluPheGlyLeuLeuGluAla                              275280285                                                                     CCCAAGGAGGCGGAGGAGGCCCCCTGG CCCCCGCCTGGAGGGGCCTTT912                          ProLysGluAlaGluGluAlaProTrpProProProGlyGlyAlaPhe                              290295300                                                                     TTGGGCTTCCTCCTCTCCCGCCCCGAGCCCATG TGGGCGGAGCTTTTG960                          LeuGlyPheLeuLeuSerArgProGluProMetTrpAlaGluLeuLeu                              305310315320                                 
                                 GCCCTGGCGGGGGCCAAGGAGGGGCGG GTCCATCGGGCGGAAGACCCC1008                         AlaLeuAlaGlyAlaLysGluGlyArgValHisArgAlaGluAspPro                              325330335                                                                     GTGGGGGCCCTAAAGGACCTGAAG GAGATCCGGGGCCTCCTCGCCAAG1056                         ValGlyAlaLeuLysAspLeuLysGluIleArgGlyLeuLeuAlaLys                              340345350                                                                     GACCTCTCGGTCCTGGCCCTGAGG GAGGGCCGGGAGATCCCGCCGGGG1104                         AspLeuSerValLeuAlaLeuArgGluGlyArgGluIleProProGly                              355360365                                                                     GACGACCCCATGCTCCTCGCCTACCTC CTGGACCCGGGGAACACCAAC1152                         AspAspProMetLeuLeuAlaTyrLeuLeuAspProGlyAsnThrAsn                              370375380                                                                     CCCGAGGGGGTGGCCCGGCGGTACGGGGGGGAG TGGAAGGAGGACGCC1200                         ProGluGlyValAlaArgArgTyrGlyGlyGluTrpLysGluAspAla                              385390395400                                                                  GCCGCCCGGGCCCTCCTTTCGGAAAGG CTCTGGCAGGCCCTTTACCCC1248                         AlaAlaArgAlaLeuLeuSerGluArgLeuTrpGlnAlaLeuTyrPro                              405410415                                                                     CGGGTGGCGGAGGAGGAAAGGCTC CTTTGGCTCTACCGGGAGGTGGAG1296                         ArgValAlaGluGluGluArgLeuLeuTrpLeuTyrArgGluValGlu                              420425430                                                                     CGGCCCCTCGCCCAGGTCCTCGCC CACATGGAGGCCACGGGGGTGCGG1344                         ArgProLeuAlaGlnValLeuAlaHisMetGluAlaThrGlyValArg                              435440445                                                                     CTGGATGTGCCCTACCTGGAGGCCCTT TCCCAGGAGGTGGCCTTTGAG1392                         
LeuAspValProTyrLeuGluAlaLeuSerGlnGluValAlaPheGlu                              450455460                                                                     CTGGAGCGCCTCGAGGCCGAGGTCCACCGCCTG GCGGGCCACCCCTTC1440                         LeuGluArgLeuGluAlaGluValHisArgLeuAlaGlyHisProPhe                              465470475480                                                                  AACCTGAACTCTAGGGACCAGCTGGAG CGGGTCCTCTTTGACGAGCTC1488                         AsnLeuAsnSerArgAspGlnLeuGluArgValLeuPheAspGluLeu                              485490495                                                                     GGCCTACCCCCCATCGGCAAGACG GAGAAGACGGGCAAGCGCTCCACC1536                         GlyLeuProProIleGlyLysThrGluLysThrGlyLysArgSerThr                              500505510                                                                     AGCGCCGCCGTCCTGGAGCTCTTA AGGGAGGCCCACCCCATCGTGGGG1584                         SerAlaAlaValLeuGluLeuLeuArgGluAlaHisProIleValGly                              515520525                                                                     CGGATCCTGGAGTACCGGGAGCTCATG AAGCTCAAGAGCACCTACATA1632                         ArgIleLeuGluTyrArgGluLeuMetLysLeuLysSerThrTyrIle                              530535540                                                                     GACCCCCTCCCCAGGCTGGTCCACCCCAAAACC GGCCGGCTCCACACC1680                         AspProLeuProArgLeuValHisProLysThrGlyArgLeuHisThr                              545550555560                                                                  CGCTTCAACCAGACGGCCACCGCCACG GGCCGCCTCTCCAGCTCCGAC1728                         ArgPheAsnGlnThrAlaThrAlaThrGlyArgLeuSerSerSerAsp                              565570575                                                                     CCCAACCTGCAGAACATCCCCGTG CGCACCCCCTTAGGCCAGCGCATC1776                         ProAsnLeuGlnAsnIleProValArgThrProLeuGlyGlnArgIle                              580585590                                         
                            CGCAAGGCCTTCATTGCCGAGGAG GGCCATCTCCTGGTGGCCCTGGAC1824                         ArgLysAlaPheIleAlaGluGluGlyHisLeuLeuValAlaLeuAsp                              595600605                                                                     TATAGCCAGATCGAGCTCCGGGTCCTC GCCCACCTCTCGGGGGACGAG1872                         TyrSerGlnIleGluLeuArgValLeuAlaHisLeuSerGlyAspGlu                              610615620                                                                     AACCTCATCCGGGTCTTCCGGGAAGGGAAGGAC ATCCACACCGAGACC1920                         AsnLeuIleArgValPheArgGluGlyLysAspIleHisThrGluThr                              625630635640                                                                  GCCGCCTGGATGTTCGGCGTGCCCCCC GAGGGGGTGGACGGGGCCATG1968                         AlaAlaTrpMetPheGlyValProProGluGlyValAspGlyAlaMet                              645650655                                                                     CGCCGGGCGGCCAAGACGGTGAAC TTCGGGGTGCTCTACGGGATGTCC2016                         ArgArgAlaAlaLysThrValAsnPheGlyValLeuTyrGlyMetSer                              660665670                                                                     GCCCACCGCCTCTCCCAGGAGCTC TCCATCCCCTACGAGGAGGCGGCG2064                         AlaHisArgLeuSerGlnGluLeuSerIleProTyrGluGluAlaAla                              675680685                                                                     GCCTTCATCGAGCGCTACTTCCAGAGC TTCCCCAAGGTGCGGGCCTGG2112                         AlaPheIleGluArgTyrPheGlnSerPheProLysValArgAlaTrp                              690695700                                                                     ATCGCCAAAACCTTGGAGGAGGGGCGGAAGAAG GGGTACGTGGAGACC2160                         IleAlaLysThrLeuGluGluGlyArgLysLysGlyTyrValGluThr                              705710715720                                                                  CTCTTCGGCCGCCGCCGCTACGTGCCC GACCTCAACGCCCGGGTGAAG2208                         
LeuPheGlyArgArgArgTyrValProAspLeuAsnAlaArgValLys
725 730 735
AGCGTGCGGGAGGCGGCGGAGCGCATGGCCTTCAACATGCCCGTGCAG 2256
SerValArgGluAlaAlaGluArgMetAlaPheAsnMetProValGln
740 745 750
GGCACCGCCGCGGACCTCATGAAGCTGGCCATGGTGAAGCTCTTCCCC 2304
GlyThrAlaAlaAspLeuMetLysLeuAlaMetValLysLeuPhePro
755 760 765
AGGCTCAGGCCCTTGGGCGTTCGCATCCTCCTCCAGGTGCACGACGAG 2352
ArgLeuArgProLeuGlyValArgIleLeuLeuGlnValHisAspGlu
770 775 780
CTGGTCTTGGAGGCCCCAAAGGCGCGGGCGGAGGAGGCCGCCCAGTTG 2400
LeuValLeuGluAlaProLysAlaArgAlaGluGluAlaAlaGlnLeu
785 790 795 800
GCCAAGGAGACCATGGAAGGGGTTTACCCCCTCTCCGTCCCCCTGGAG 2448
AlaLysGluThrMetGluGlyValTyrProLeuSerValProLeuGlu
805 810 815
GTGGAGGTGGGGATGGGGGAGGACTGGCTTTCCGCCAAGGCC 2490
ValGluValGlyMetGlyGluAspTrpLeuSerAlaLysAla
820 825 830
TAG 2493

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
  (A) LENGTH: 830 amino acids
  (B) TYPE: amino acid
  (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
MetLeuProLeuPheGluProLysGlyArgValLeuLeuValAspGly
1 5 10 15
HisHisLeuAlaTyrArgThrPhePheAlaLeuLysGlyLeuThrThr
20 25 30
SerArgGlyGluProValGlnAlaValTyrGlyPheAlaLysSerLeu
35 40 45
LeuLysAlaLeuLysGluAspGlyGluValAlaIleValValPheAsp
50 55 60
AlaLysAlaProSerPheArgHisGluAlaTyrGluAlaTyrLysAla
65 70 75 80
GlyArgAlaProThrProGluAspPheProArgGlnLeuAlaLeuIle
85 90 95
LysGluLeuValAspLeuLeuGlyLeuValArgLeuGluValProGly
100 105 110
PheGluAlaAspAspValLeuAlaThrLeuAlaLysLysAlaGluArg
115 120 125
GluGlyTyrGluValArgIleLeuSerAlaAspArgAspLeuTyrGln
130 135 140
LeuLeuSerAspArgIleHisLeuLeuHisProGluGlyGluValLeu
145 150 155 160
ThrProGlyTrpLeuGlnGluArgTyrGlyLeuSerProGluArgTrp
165 170 175
ValGluTyrArgAlaLeuValGlyAspProSerAspAsnLeuProGly
180 185 190
ValProGlyIleGlyGluLysThrAlaLeuLysLeuLeuLysGluTrp
195 200 205
GlySerLeuGluAlaIleLeuLysAsnLeuAspGlnValLysProGlu
210 215 220
ArgValArgGluAlaIleArgAsnAsnLeuAspLysLeuGlnMetSer
225 230 235 240
LeuGluLeuSerArgLeuArgThrAspLeuProLeuGluValAspPhe
245 250 255
AlaLysArgArgGluProAspTrpGluGlyLeuLysAlaPheLeuGlu
260 265 270
ArgLeuGluPheGlySerLeuLeuHisGluPheGlyLeuLeuGluAla
275 280 285
ProLysGluAlaGluGluAlaProTrpProProProGlyGlyAlaPhe
290 295 300
LeuGlyPheLeuLeuSerArgProGluProMetTrpAlaGluLeuLeu
305 310 315 320
AlaLeuAlaGlyAlaLysGluGlyArgValHisArgAlaGluAspPro
325 330 335
ValGlyAlaLeuLysAspLeuLysGluIleArgGlyLeuLeuAlaLys
340 345 350
AspLeuSerValLeuAlaLeuArgGluGlyArgGluIleProProGly
355 360 365
AspAspProMetLeuLeuAlaTyrLeuLeuAspProGlyAsnThrAsn
370 375 380
ProGluGlyValAlaArgArgTyrGlyGlyGluTrpLysGluAspAla
385 390 395 400
AlaAlaArgAlaLeuLeuSerGluArgLeuTrpGlnAlaLeuTyrPro
405 410 415
ArgValAlaGluGluGluArgLeuLeuTrpLeuTyrArgGluValGlu
420 425 430
ArgProLeuAlaGlnValLeuAlaHisMetGluAlaThrGlyValArg
435 440 445
LeuAspValProTyrLeuGluAlaLeuSerGlnGluValAlaPheGlu
450 455 460
LeuGluArgLeuGluAlaGluValHisArgLeuAlaGlyHisProPhe
465 470 475 480
AsnLeuAsnSerArgAspGlnLeuGluArgValLeuPheAspGluLeu
485 490 495
GlyLeuProProIleGlyLysThrGluLysThrGlyLysArgSerThr
500 505 510
SerAlaAlaValLeuGluLeuLeuArgGluAlaHisProIleValGly
515 520 525
ArgIleLeuGluTyrArgGluLeuMetLysLeuLysSerThrTyrIle
530 535 540
AspProLeuProArgLeuValHisProLysThrGlyArgLeuHisThr
545 550 555 560
ArgPheAsnGlnThrAlaThrAlaThrGlyArgLeuSerSerSerAsp
565 570 575
ProAsnLeuGlnAsnIleProValArgThrProLeuGlyGlnArgIle
580 585 590
ArgLysAlaPheIleAlaGluGluGlyHisLeuLeuValAlaLeuAsp
595 600 605
TyrSerGlnIleGluLeuArgValLeuAlaHisLeuSerGlyAspGlu
610 615 620
AsnLeuIleArgValPheArgGluGlyLysAspIleHisThrGluThr
625 630 635 640
AlaAlaTrpMetPheGlyValProProGluGlyValAspGlyAlaMet
645 650 655
ArgArgAlaAlaLysThrValAsnPheGlyValLeuTyrGlyMetSer
660 665 670
AlaHisArgLeuSerGlnGluLeuSerIleProTyrGluGluAlaAla
675 680 685
AlaPheIleGluArgTyrPheGlnSerPheProLysValArgAlaTrp
690 695 700
IleAlaLysThrLeuGluGluGlyArgLysLysGlyTyrValGluThr
705 710 715 720
LeuPheGlyArgArgArgTyrValProAspLeuAsnAlaArgValLys
725 730 735
SerValArgGluAlaAlaGluArgMetAlaPheAsnMetProValGln
740 745 750
GlyThrAlaAlaAspLeuMetLysLeuAlaMetValLysLeuPhePro
755 760 765
ArgLeuArgProLeuGlyValArgIleLeuLeuGlnValHisAspGlu
770 775 780
LeuValLeuGluAlaProLysAlaArgAlaGluGluAlaAlaGlnLeu
785 790 795 800
AlaLysGluThrMetGluGlyValTyrProLeuSerValProLeuGlu
805 810 815
ValGluValGlyMetGlyGluAspTrpLeuSerAlaLysAla
820 825 830
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Thermus species Z05
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2502
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
ATGAAGGCGATGCTTCCGCTCTTTGAACCCAAAGGCCGGGTTCTCCTG 48
MetLysAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeu
1 5 10 15
GTGGACGGCCACCACCTGGCCTACCGCACCTTCTTCGCCCTAAAGGGC 96
ValAspGlyHisHisLeuAlaTyrArgThrPhePheAlaLeuLysGly
20 25 30
CTCACCACGAGCCGGGGCGAACCGGTGCAGGCGGTTTACGGCTTCGCC 144
LeuThrThrSerArgGlyGluProValGlnAlaValTyrGlyPheAla
35 40 45
AAGAGCCTCCTCAAGGCCCTGAAGGAGGACGGGTACAAGGCCGTCTTC 192
LysSerLeuLeuLysAlaLeuLysGluAspGlyTyrLysAlaValPhe
50 55 60
GTGGTCTTTGACGCCAAGGCCCCTTCCTTCCGCCACGAGGCCTACGAG 240
ValValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlu
65 70 75 80
GCCTACAAGGCAGGCCGCGCCCCGACCCCCGAGGACTTCCCCCGGCAG 288
AlaTyrLysAlaGlyArgAlaProThrProGluAspPheProArgGln
85 90 95
CTCGCCCTCATCAAGGAGCTGGTGGACCTCCTGGGGTTTACTCGCCTC 336
LeuAlaLeuIleLysGluLeuValAspLeuLeuGlyPheThrArgLeu
100 105 110
GAGGTTCCGGGCTTTGAGGCGGACGACGTCCTCGCCACCCTGGCCAAG 384
GluValProGlyPheGluAlaAspAspValLeuAlaThrLeuAlaLys
115 120 125
AAGGCGGAAAGGGAGGGGTACGAGGTGCGCATCCTCACCGCCGACCGG 432
LysAlaGluArgGluGlyTyrGluValArgIleLeuThrAlaAspArg
130 135 140
GACCTTTACCAGCTCGTCTCCGACCGCGTCGCCGTCCTCCACCCCGAG 480
AspLeuTyrGlnLeuValSerAspArgValAlaValLeuHisProGlu
145 150 155 160
GGCCACCTCATCACCCCGGAGTGGCTTTGGGAGAAGTACGGCCTTAAG 528
GlyHisLeuIleThrProGluTrpLeuTrpGluLysTyrGlyLeuLys
165 170 175
CCGGAGCAGTGGGTGGACTTCCGCGCCCTCGTGGGGGACCCCTCCGAC 576
ProGluGlnTrpValAspPheArgAlaLeuValGlyAspProSerAsp
180 185 190
AACCTCCCCGGGGTCAAGGGCATCGGGGAGAAGACCGCCCTCAAGCTC 624
AsnLeuProGlyValLysGlyIleGlyGluLysThrAlaLeuLysLeu
195 200 205
CTCAAGGAGTGGGGAAGCCTGGAAAATATCCTCAAGAACCTGGACCGG 672
LeuLysGluTrpGlySerLeuGluAsnIleLeuLysAsnLeuAspArg
210 215 220
GTGAAGCCGGAAAGCGTCCGGGAAAGGATCAAGGCCCACCTGGAAGAC 720
ValLysProGluSerValArgGluArgIleLysAlaHisLeuGluAsp
225 230 235 240
CTTAAGCTCTCCTTGGAGCTTTCCCGGGTGCGCTCGGACCTCCCCCTG 768
LeuLysLeuSerLeuGluLeuSerArgValArgSerAspLeuProLeu
245 250 255
GAGGTGGACTTCGCCCGGAGGCGGGAGCCTGACCGGGAAGGGCTTCGG 816
GluValAspPheAlaArgArgArgGluProAspArgGluGlyLeuArg
260 265 270
GCCTTTTTGGAGCGCTTGGAGTTCGGCAGCCTCCTCCACGAGTTCGGC 864
AlaPheLeuGluArgLeuGluPheGlySerLeuLeuHisGluPheGly
275 280 285
CTCCTCGAGGCCCCCGCCCCCCTGGAGGAGGCCCCCTGGCCCCCGCCG 912
LeuLeuGluAlaProAlaProLeuGluGluAlaProTrpProProPro
290 295 300
GAAGGGGCCTTCGTGGGCTTCGTCCTCTCCCGCCCCGAGCCCATGTGG 960
GluGlyAlaPheValGlyPheValLeuSerArgProGluProMetTrp
305 310 315 320
GCGGAGCTTAAAGCCCTGGCCGCCTGCAAGGAGGGCCGGGTGCACCGG 1008
AlaGluLeuLysAlaLeuAlaAlaCysLysGluGlyArgValHisArg
325 330 335
GCAAAGGACCCCTTGGCGGGGCTAAAGGACCTCAAGGAGGTCCGAGGC 1056
AlaLysAspProLeuAlaGlyLeuLysAspLeuLysGluValArgGly
340 345 350
CTCCTCGCCAAGGACCTCGCCGTTTTGGCCCTTCGCGAGGGGCTGGAC 1104
LeuLeuAlaLysAspLeuAlaValLeuAlaLeuArgGluGlyLeuAsp
355 360 365
CTCGCGCCTTCGGACGACCCCATGCTCCTCGCCTACCTCCTGGACCCC 1152
LeuAlaProSerAspAspProMetLeuLeuAlaTyrLeuLeuAspPro
370 375 380
TCCAACACCACCCCCGAGGGGGTGGCCCGGCGCTACGGGGGGGAGTGG 1200
SerAsnThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrp
385 390 395 400
ACGGAGGACGCCGCCCACCGGGCCCTCCTCGCCGAGCGGCTCCAGCAA 1248
ThrGluAspAlaAlaHisArgAlaLeuLeuAlaGluArgLeuGlnGln
405 410 415
AACCTCTTGGAACGCCTCAAGGGAGAGGAAAAGCTCCTTTGGCTCTAC 1296
AsnLeuLeuGluArgLeuLysGlyGluGluLysLeuLeuTrpLeuTyr
420 425 430
CAAGAGGTGGAAAAGCCCCTCTCCCGGGTCCTGGCCCACATGGAGGCC 1344
GlnGluValGluLysProLeuSerArgValLeuAlaHisMetGluAla
435 440 445
ACCGGGGTAAGGCTGGACGTGGCCTATCTAAAGGCCCTTTCCCTGGAG 1392
ThrGlyValArgLeuAspValAlaTyrLeuLysAlaLeuSerLeuGlu
450 455 460
CTTGCGGAGGAGATTCGCCGCCTCGAGGAGGAGGTCTTCCGCCTGGCG 1440
LeuAlaGluGluIleArgArgLeuGluGluGluValPheArgLeuAla
465 470 475 480
GGCCACCCCTTCAACCTGAACTCCCGTGACCAGCTAGAGCGGGTGCTC 1488
GlyHisProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeu
485 490 495
TTTGACGAGCTTAGGCTTCCCGCCCTGGGCAAGACGCAAAAGACGGGG 1536
PheAspGluLeuArgLeuProAlaLeuGlyLysThrGlnLysThrGly
500 505 510
AAGCGCTCCACCAGCGCCGCGGTGCTGGAGGCCCTCAGGGAGGCCCAC 1584
LysArgSerThrSerAlaAlaValLeuGluAlaLeuArgGluAlaHis
515 520 525
CCCATCGTGGAGAAGATCCTCCAGCACCGGGAGCTCACCAAGCTCAAG 1632
ProIleValGluLysIleLeuGlnHisArgGluLeuThrLysLeuLys
530 535 540
AACACCTACGTGGACCCCCTCCCGGGCCTCGTCCACCCGAGGACGGGC 1680
AsnThrTyrValAspProLeuProGlyLeuValHisProArgThrGly
545 550 555 560
CGCCTCCACACCCGCTTCAACCAGACAGCCACGGCCACGGGAAGGCTC 1728
ArgLeuHisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeu
565 570 575
TCTAGCTCCGACCCCAACCTGCAGAACATCCCCATCCGCACCCCCTTG 1776
SerSerSerAspProAsnLeuGlnAsnIleProIleArgThrProLeu
580 585 590
GGCCAGAGGATCCGCCGGGCCTTCGTGGCCGAGGCGGGATGGGCGTTG 1824
GlyGlnArgIleArgArgAlaPheValAlaGluAlaGlyTrpAlaLeu
595 600 605
GTGGCCCTGGACTATAGCCAGATAGAGCTCCGGGTCCTCGCCCACCTC 1872
ValAlaLeuAspTyrSerGlnIleGluLeuArgValLeuAlaHisLeu
610 615 620
TCCGGGGACGAGAACCTGATCAGGGTCTTCCAGGAGGGGAAGGACATC 1920
SerGlyAspGluAsnLeuIleArgValPheGlnGluGlyLysAspIle
625 630 635 640
CACACCCAGACCGCAAGCTGGATGTTCGGCGTCTCCCCGGAGGCCGTG 1968
HisThrGlnThrAlaSerTrpMetPheGlyValSerProGluAlaVal
645 650 655
GACCCCCTGATGCGCCGGGCGGCCAAGACGGTGAACTTCGGCGTCCTC 2016
AspProLeuMetArgArgAlaAlaLysThrValAsnPheGlyValLeu
660 665 670
TACGGCATGTCCGCCCATAGGCTCTCCCAGGAGCTTGCCATCCCCTAC 2064
TyrGlyMetSerAlaHisArgLeuSerGlnGluLeuAlaIleProTyr
675 680 685
GAGGAGGCGGTGGCCTTTATAGAGCGCTACTTCCAAAGCTTCCCCAAG 2112
GluGluAlaValAlaPheIleGluArgTyrPheGlnSerPheProLys
690 695 700
GTGCGGGCCTGGATAGAAAAGACCCTGGAGGAGGGGAGGAAGCGGGGC 2160
ValArgAlaTrpIleGluLysThrLeuGluGluGlyArgLysArgGly
705 710 715 720
TACGTGGAAACCCTCTTCGGAAGAAGGCGCTACGTGCCCGACCTCAAC 2208
TyrValGluThrLeuPheGlyArgArgArgTyrValProAspLeuAsn
725 730 735
GCCCGGGTGAAGAGCGTCAGGGAGGCCGCGGAGCGCATGGCCTTCAAC 2256
AlaArgValLysSerValArgGluAlaAlaGluArgMetAlaPheAsn
740 745 750
ATGCCCGTCCAGGGCACCGCCGCCGACCTCATGAAGCTCGCCATGGTG 2304
MetProValGlnGlyThrAlaAlaAspLeuMetLysLeuAlaMetVal
755 760 765
AAGCTCTTCCCCCACCTCCGGGAGATGGGGGCCCGCATGCTCCTCCAG 2352
LysLeuPheProHisLeuArgGluMetGlyAlaArgMetLeuLeuGln
770 775 780
GTCCACGACGAGCTCCTCCTGGAGGCCCCCCAAGCGCGGGCCGAGGAG 2400
ValHisAspGluLeuLeuLeuGluAlaProGlnAlaArgAlaGluGlu
785 790 795 800
GTGGCGGCTTTGGCCAAGGAGGCCATGGAGAAGGCCTATCCCCTCGCC 2448
ValAlaAlaLeuAlaLysGluAlaMetGluLysAlaTyrProLeuAla
805 810 815
GTGCCCCTGGAGGTGGAGGTGGGGATCGGGGAGGACTGGCTTTCCGCC 2496
ValProLeuGluValGluValGlyIleGlyGluAspTrpLeuSerAla
820 825 830
AAGGGCTGA 2505
LysGly
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 834 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
MetLysAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeu
1 5 10 15
ValAspGlyHisHisLeuAlaTyrArgThrPhePheAlaLeuLysGly
20 25 30
LeuThrThrSerArgGlyGluProValGlnAlaValTyrGlyPheAla
35 40 45
LysSerLeuLeuLysAlaLeuLysGluAspGlyTyrLysAlaValPhe
50 55 60
ValValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlu
65 70 75 80
AlaTyrLysAlaGlyArgAlaProThrProGluAspPheProArgGln
85 90 95
LeuAlaLeuIleLysGluLeuValAspLeuLeuGlyPheThrArgLeu
100 105 110
GluValProGlyPheGluAlaAspAspValLeuAlaThrLeuAlaLys
115 120 125
LysAlaGluArgGluGlyTyrGluValArgIleLeuThrAlaAspArg
130 135 140
AspLeuTyrGlnLeuValSerAspArgValAlaValLeuHisProGlu
145 150 155 160
GlyHisLeuIleThrProGluTrpLeuTrpGluLysTyrGlyLeuLys
165 170 175
ProGluGlnTrpValAspPheArgAlaLeuValGlyAspProSerAsp
180 185 190
AsnLeuProGlyValLysGlyIleGlyGluLysThrAlaLeuLysLeu
195 200 205
LeuLysGluTrpGlySerLeuGluAsnIleLeuLysAsnLeuAspArg
210 215 220
ValLysProGluSerValArgGluArgIleLysAlaHisLeuGluAsp
225 230 235 240
LeuLysLeuSerLeuGluLeuSerArgValArgSerAspLeuProLeu
245 250 255
GluValAspPheAlaArgArgArgGluProAspArgGluGlyLeuArg
260 265 270
AlaPheLeuGluArgLeuGluPheGlySerLeuLeuHisGluPheGly
275 280 285
LeuLeuGluAlaProAlaProLeuGluGluAlaProTrpProProPro
290 295 300
GluGlyAlaPheValGlyPheValLeuSerArgProGluProMetTrp
305 310 315 320
AlaGluLeuLysAlaLeuAlaAlaCysLysGluGlyArgValHisArg
325 330 335
AlaLysAspProLeuAlaGlyLeuLysAspLeuLysGluValArgGly
340 345 350
LeuLeuAlaLysAspLeuAlaValLeuAlaLeuArgGluGlyLeuAsp
355 360 365
LeuAlaProSerAspAspProMetLeuLeuAlaTyrLeuLeuAspPro
370 375 380
SerAsnThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrp
385 390 395 400
ThrGluAspAlaAlaHisArgAlaLeuLeuAlaGluArgLeuGlnGln
405 410 415
AsnLeuLeuGluArgLeuLysGlyGluGluLysLeuLeuTrpLeuTyr
420 425 430
GlnGluValGluLysProLeuSerArgValLeuAlaHisMetGluAla
435 440 445
ThrGlyValArgLeuAspValAlaTyrLeuLysAlaLeuSerLeuGlu
450 455 460
LeuAlaGluGluIleArgArgLeuGluGluGluValPheArgLeuAla
465 470 475 480
GlyHisProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeu
485 490 495
PheAspGluLeuArgLeuProAlaLeuGlyLysThrGlnLysThrGly
500 505 510
LysArgSerThrSerAlaAlaValLeuGluAlaLeuArgGluAlaHis
515 520 525
ProIleValGluLysIleLeuGlnHisArgGluLeuThrLysLeuLys
530 535 540
AsnThrTyrValAspProLeuProGlyLeuValHisProArgThrGly
545 550 555 560
ArgLeuHisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeu
565 570 575
SerSerSerAspProAsnLeuGlnAsnIleProIleArgThrProLeu
580 585 590
GlyGlnArgIleArgArgAlaPheValAlaGluAlaGlyTrpAlaLeu
595 600 605
ValAlaLeuAspTyrSerGlnIleGluLeuArgValLeuAlaHisLeu
610 615 620
SerGlyAspGluAsnLeuIleArgValPheGlnGluGlyLysAspIle
625 630 635 640
HisThrGlnThrAlaSerTrpMetPheGlyValSerProGluAlaVal
645 650 655
AspProLeuMetArgArgAlaAlaLysThrValAsnPheGlyValLeu
660 665 670
TyrGlyMetSerAlaHisArgLeuSerGlnGluLeuAlaIleProTyr
675 680 685
GluGluAlaValAlaPheIleGluArgTyrPheGlnSerPheProLys
690 695 700
ValArgAlaTrpIleGluLysThrLeuGluGluGlyArgLysArgGly
705 710 715 720
TyrValGluThrLeuPheGlyArgArgArgTyrValProAspLeuAsn
725 730 735
AlaArgValLysSerValArgGluAlaAlaGluArgMetAlaPheAsn
740 745 750
MetProValGlnGlyThrAlaAlaAspLeuMetLysLeuAlaMetVal
755 760 765
LysLeuPheProHisLeuArgGluMetGlyAlaArgMetLeuLeuGln
770 775 780
ValHisAspGluLeuLeuLeuGluAlaProGlnAlaArgAlaGluGlu
785 790 795 800
ValAlaAlaLeuAlaLysGluAlaMetGluLysAlaTyrProLeuAla
805 810 815
ValProLeuGluValGluValGlyIleGlyGluAspTrpLeuSerAla
820 825 830
LysGly
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Thermus thermophilus
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2502
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
ATGGAGGCGATGCTTCCGCTCTTTGAACCCAAAGGCCGGGTCCTCCTG 48
MetGluAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeu
1 5 10 15
GTGGACGGCCACCACCTGGCCTACCGCACCTTCTTCGCCCTGAAGGGC 96
ValAspGlyHisHisLeuAlaTyrArgThrPhePheAlaLeuLysGly
20 25 30
CTCACCACGAGCCGGGGCGAACCGGTGCAGGCGGTCTACGGCTTCGCC 144
LeuThrThrSerArgGlyGluProValGlnAlaValTyrGlyPheAla
35 40 45
AAGAGCCTCCTCAAGGCCCTGAAGGAGGACGGGTACAAGGCCGTCTTC 192
LysSerLeuLeuLysAlaLeuLysGluAspGlyTyrLysAlaValPhe
50 55 60
GTGGTCTTTGACGCCAAGGCCCCCTCCTTCCGCCACGAGGCCTACGAG 240
ValValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlu
65 70 75 80
GCCTACAAGGCGGGGAGGGCCCCGACCCCCGAGGACTTCCCCCGGCAG 288
AlaTyrLysAlaGlyArgAlaProThrProGluAspPheProArgGln
85 90 95
CTCGCCCTCATCAAGGAGCTGGTGGACCTCCTGGGGTTTACCCGCCTC 336
LeuAlaLeuIleLysGluLeuValAspLeuLeuGlyPheThrArgLeu
100 105 110
GAGGTCCCCGGCTACGAGGCGGACGACGTTCTCGCCACCCTGGCCAAG 384
GluValProGlyTyrGluAlaAspAspValLeuAlaThrLeuAlaLys
115 120 125
AAGGCGGAAAAGGAGGGGTACGAGGTGCGCATCCTCACCGCCGACCGC 432
LysAlaGluLysGluGlyTyrGluValArgIleLeuThrAlaAspArg
130 135 140
GACCTCTACCAACTCGTCTCCGACCGCGTCGCCGTCCTCCACCCCGAG 480
AspLeuTyrGlnLeuValSerAspArgValAlaValLeuHisProGlu
145 150 155 160
GGCCACCTCATCACCCCGGAGTGGCTTTGGGAGAAGTACGGCCTCAGG 528
GlyHisLeuIleThrProGluTrpLeuTrpGluLysTyrGlyLeuArg
165 170 175
CCGGAGCAGTGGGTGGACTTCCGCGCCCTCGTGGGGGACCCCTCCGAC 576
ProGluGlnTrpValAspPheArgAlaLeuValGlyAspProSerAsp
180 185 190
AACCTCCCCGGGGTCAAGGGCATCGGGGAGAAGACCGCCCTCAAGCTC 624
AsnLeuProGlyValLysGlyIleGlyGluLysThrAlaLeuLysLeu
195 200 205
CTCAAGGAGTGGGGAAGCCTGGAAAACCTCCTCAAGAACCTGGACCGG 672
LeuLysGluTrpGlySerLeuGluAsnLeuLeuLysAsnLeuAspArg
210 215 220
GTAAAGCCAGAAAACGTCCGGGAGAAGATCAAGGCCCACCTGGAAGAC 720
ValLysProGluAsnValArgGluLysIleLysAlaHisLeuGluAsp
225 230 235 240
CTCAGGCTCTCCTTGGAGCTCTCCCGGGTGCGCACCGACCTCCCCCTG 768
LeuArgLeuSerLeuGluLeuSerArgValArgThrAspLeuProLeu
245 250 255
GAGGTGGACCTCGCCCAGGGGCGGGAGCCCGACCGGGAGGGGCTTAGG 816
GluValAspLeuAlaGlnGlyArgGluProAspArgGluGlyLeuArg
260 265 270
GCCTTCCTGGAGAGGCTGGAGTTCGGCAGCCTCCTCCACGAGTTCGGC 864
AlaPheLeuGluArgLeuGluPheGlySerLeuLeuHisGluPheGly
275 280 285
CTCCTGGAGGCCCCCGCCCCCCTGGAGGAGGCCCCCTGGCCCCCGCCG 912
LeuLeuGluAlaProAlaProLeuGluGluAlaProTrpProProPro
290 295 300
GAAGGGGCCTTCGTGGGCTTCGTCCTCTCCCGCCCCGAGCCCATGTGG 960
GluGlyAlaPheValGlyPheValLeuSerArgProGluProMetTrp
305 310 315 320
GCGGAGCTTAAAGCCCTGGCCGCCTGCAGGGACGGCCGGGTGCACCGG 1008
AlaGluLeuLysAlaLeuAlaAlaCysArgAspGlyArgValHisArg
325 330 335
GCAGCAGACCCCTTGGCGGGGCTAAAGGACCTCAAGGAGGTCCGGGGC 1056
AlaAlaAspProLeuAlaGlyLeuLysAspLeuLysGluValArgGly
340 345 350
CTCCTCGCCAAGGACCTCGCCGTCTTGGCCTCGAGGGAGGGGCTAGAC 1104
LeuLeuAlaLysAspLeuAlaValLeuAlaSerArgGluGlyLeuAsp
355 360 365
CTCGTGCCCGGGGACGACCCCATGCTCCTCGCCTACCTCCTGGACCCC 1152
LeuValProGlyAspAspProMetLeuLeuAlaTyrLeuLeuAspPro
370 375 380
TCCAACACCACCCCCGAGGGGGTGGCGCGGCGCTACGGGGGGGAGTGG 1200
SerAsnThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrp
385 390 395 400
ACGGAGGACGCCGCCCACCGGGCCCTCCTCTCGGAGAGGCTCCATCGG 1248
ThrGluAspAlaAlaHisArgAlaLeuLeuSerGluArgLeuHisArg
405 410 415
AACCTCCTTAAGCGCCTCGAGGGGGAGGAGAAGCTCCTTTGGCTCTAC 1296
AsnLeuLeuLysArgLeuGluGlyGluGluLysLeuLeuTrpLeuTyr
420 425 430
CACGAGGTGGAAAAGCCCCTCTCCCGGGTCCTGGCCCACATGGAGGCC 1344
HisGluValGluLysProLeuSerArgValLeuAlaHisMetGluAla
435 440 445
ACCGGGGTACGGCTGGACGTGGCCTACCTTCAGGCCCTTTCCCTGGAG 1392
ThrGlyValArgLeuAspValAlaTyrLeuGlnAlaLeuSerLeuGlu
450 455 460
CTTGCGGAGGAGATCCGCCGCCTCGAGGAGGAGGTCTTCCGCTTGGCG 1440
LeuAlaGluGluIleArgArgLeuGluGluGluValPheArgLeuAla
465 470 475 480
GGCCACCCCTTCAACCTCAACTCCCGGGACCAGCTGGAAAGGGTGCTC 1488
GlyHisProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeu
485 490 495
TTTGACGAGCTTAGGCTTCCCGCCTTGGGGAAGACGCAAAAGACAGGC 1536
PheAspGluLeuArgLeuProAlaLeuGlyLysThrGlnLysThrGly
500 505 510
AAGCGCTCCACCAGCGCCGCGGTGCTGGAGGCCCTACGGGAGGCCCAC 1584
LysArgSerThrSerAlaAlaValLeuGluAlaLeuArgGluAlaHis
515 520 525
CCCATCGTGGAGAAGATCCTCCAGCACCGGGAGCTCACCAAGCTCAAG 1632
ProIleValGluLysIleLeuGlnHisArgGluLeuThrLysLeuLys
530 535 540
AACACCTACGTGGACCCCCTCCCAAGCCTCGTCCACCCGAGGACGGGC 1680
AsnThrTyrValAspProLeuProSerLeuValHisProArgThrGly
545 550 555 560
CGCCTCCACACCCGCTTCAACCAGACGGCCACGGCCACGGGGAGGCTT 1728
ArgLeuHisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeu
565 570 575
AGTAGCTCCGACCCCAACCTGCAGAACATCCCCGTCCGCACCCCCTTG 1776
SerSerSerAspProAsnLeuGlnAsnIleProValArgThrProLeu
580 585 590
GGCCAGAGGATCCGCCGGGCCTTCGTGGCCGAGGCGGGTTGGGCGTTG 1824
GlyGlnArgIleArgArgAlaPheValAlaGluAlaGlyTrpAlaLeu
595 600 605
GTGGCCCTGGACTATAGCCAGATAGAGCTCCGCGTCCTCGCCCACCTC 1872
ValAlaLeuAspTyrSerGlnIleGluLeuArgValLeuAlaHisLeu
610 615 620
TCCGGGGACGAAAACCTGATCAGGGTCTTCCAGGAGGGGAAGGACATC 1920
SerGlyAspGluAsnLeuIleArgValPheGlnGluGlyLysAspIle
625 630 635 640
CACACCCAGACCGCAAGCTGGATGTTCGGCGTCCCCCCGGAGGCCGTG 1968
HisThrGlnThrAlaSerTrpMetPheGlyValProProGluAlaVal
645 650 655
GACCCCCTGATGCGCCGGGCGGCCAAGACGGTGAACTTCGGCGTCCTC 2016
AspProLeuMetArgArgAlaAlaLysThrValAsnPheGlyValLeu
660 665 670
TACGGCATGTCCGCCCATAGGCTCTCCCAGGAGCTTGCCATCCCCTAC 2064
TyrGlyMetSerAlaHisArgLeuSerGlnGluLeuAlaIleProTyr
675 680 685
GAGGAGGCGGTGGCCTTTATAGAGCGCTACTTCCAAAGCTTCCCCAAG 2112
GluGluAlaValAlaPheIleGluArgTyrPheGlnSerPheProLys
690 695 700
GTGCGGGCCTGGATAGAAAAGACCCTGGAGGAGGGGAGGAAGCGGGGC 2160
ValArgAlaTrpIleGluLysThrLeuGluGluGlyArgLysArgGly
705 710 715 720
TACGTGGAAACCCTCTTCGGAAGAAGGCGCTACGTGCCCGACCTCAAC 2208
TyrValGluThrLeuPheGlyArgArgArgTyrValProAspLeuAsn
725 730 735
GCCCGGGTGAAGAGCGTCAGGGAGGCCGCGGAGCGCATGGCCTTCAAC 2256
AlaArgValLysSerValArgGluAlaAlaGluArgMetAlaPheAsn
740 745 750
ATGCCCGTCCAGGGCACCGCCGCCGACCTCATGAAGCTCGCCATGGTG 2304
MetProValGlnGlyThrAlaAlaAspLeuMetLysLeuAlaMetVal
755 760 765
AAGCTCTTCCCCCGCCTCCGGGAGATGGGGGCCCGCATGCTCCTCCAG 2352
LysLeuPheProArgLeuArgGluMetGlyAlaArgMetLeuLeuGln
770 775 780
GTCCACGACGAGCTCCTCCTGGAGGCCCCCCAAGCGCGGGCCGAGGAG 2400
ValHisAspGluLeuLeuLeuGluAlaProGlnAlaArgAlaGluGlu
785 790 795 800
GTGGCGGCTTTGGCCAAGGAGGCCATGGAGAAGGCCTATCCCCTCGCC 2448
ValAlaAlaLeuAlaLysGluAlaMetGluLysAlaTyrProLeuAla
805 810 815
GTGCCCCTGGAGGTGGAGGTGGGGATGGGGGAGGACTGGCTTTCCGCC 2496
ValProLeuGluValGluValGlyMetGlyGluAspTrpLeuSerAla
820 825 830
AAGGGTTAG 2505
LysGly
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 834 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
MetGluAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeu
1 5 10 15
ValAspGlyHisHisLeuAlaTyrArgThrPhePheAlaLeuLysGly
20 25 30
LeuThrThrSerArgGlyGluProValGlnAlaValTyrGlyPheAla
35 40 45
LysSerLeuLeuLysAlaLeuLysGluAspGlyTyrLysAlaValPhe
50 55 60
ValValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlu
65 70 75 80
AlaTyrLysAlaGlyArgAlaProThrProGluAspPheProArgGln
85 90 95
LeuAlaLeuIleLysGluLeuValAspLeuLeuGlyPheThrArgLeu
100 105 110
GluValProGlyTyrGluAlaAspAspValLeuAlaThrLeuAlaLys
115 120 125
LysAlaGluLysGluGlyTyrGluValArgIleLeuThrAlaAspArg
130 135 140
AspLeuTyrGlnLeuValSerAspArgValAlaValLeuHisProGlu
145 150 155 160
GlyHisLeuIleThrProGluTrpLeuTrpGluLysTyrGlyLeuArg
165 170 175
ProGluGlnTrpValAspPheArgAlaLeuValGlyAspProSerAsp
180 185 190
AsnLeuProGlyValLysGlyIleGlyGluLysThrAlaLeuLysLeu
195 200 205
LeuLysGluTrpGlySerLeuGluAsnLeuLeuLysAsnLeuAspArg
210 215 220
ValLysProGluAsnValArgGluLysIleLysAlaHisLeuGluAsp
225 230 235 240
LeuArgLeuSerLeuGluLeuSerArgValArgThrAspLeuProLeu
245 250 255
GluValAspLeuAlaGlnGlyArgGluProAspArgGluGlyLeuArg
260 265 270
AlaPheLeuGluArgLeuGluPheGlySerLeuLeuHisGluPheGly
275 280 285
LeuLeuGluAlaProAlaProLeuGluGluAlaProTrpProProPro
290 295 300
GluGlyAlaPheValGlyPheValLeuSerArgProGluProMetTrp
305 310 315 320
AlaGluLeuLysAlaLeuAlaAlaCysArgAspGlyArgValHisArg
325 330 335
AlaAlaAspProLeuAlaGlyLeuLysAspLeuLysGluValArgGly
340 345 350
LeuLeuAlaLysAspLeuAlaValLeuAlaSerArgGluGlyLeuAsp
355 360 365
LeuValProGlyAspAspProMetLeuLeuAlaTyrLeuLeuAspPro
370 375 380
SerAsnThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrp
385 390 395 400
ThrGluAspAlaAlaHisArgAlaLeuLeuSerGluArgLeuHisArg
405 410 415
AsnLeuLeuLysArgLeuGluGlyGluGluLysLeuLeuTrpLeuTyr
420 425 430
HisGluValGluLysProLeuSerArgValLeuAlaHisMetGluAla
435 440 445
ThrGlyValArgLeuAspValAlaTyrLeuGlnAlaLeuSerLeuGlu
450 455 460
LeuAlaGluGluIleArgArgLeuGluGluGluValPheArgLeuAla
465 470 475 480
GlyHisProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeu
485 490 495
PheAspGluLeuArgLeuProAlaLeuGlyLysThrGlnLysThrGly
500 505 510
LysArgSerThrSerAlaAlaValLeuGluAlaLeuArgGluAlaHis
515 520 525
ProIleValGluLysIleLeuGlnHisArgGluLeuThrLysLeuLys
530 535 540
AsnThrTyrValAspProLeuProSerLeuValHisProArgThrGly
545 550 555 560
ArgLeuHisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeu
565 570 575
SerSerSerAspProAsnLeuGlnAsnIleProValArgThrProLeu
580 585 590
GlyGlnArgIleArgArgAlaPheValAlaGluAlaGlyTrpAlaLeu
595 600 605
ValAlaLeuAspTyrSerGlnIleGluLeuArgValLeuAlaHisLeu
610 615 620
SerGlyAspGluAsnLeuIleArgValPheGlnGluGlyLysAspIle
625 630 635 640
HisThrGlnThrAlaSerTrpMetPheGlyValProProGluAlaVal
645 650 655
AspProLeuMetArgArgAlaAlaLysThrValAsnPheGlyValLeu
660 665 670
TyrGlyMetSerAlaHisArgLeuSerGlnGluLeuAlaIleProTyr
675 680 685
GluGluAlaValAlaPheIleGluArgTyrPheGlnSerPheProLys
690 695 700
ValArgAlaTrpIleGluLysThrLeuGluGluGlyArgLysArgGly
705 710 715 720
TyrValGluThrLeuPheGlyArgArgArgTyrValProAspLeuAsn
725 730 735
AlaArgValLysSerValArgGluAlaAlaGluArgMetAlaPheAsn
740 745 750
MetProValGlnGlyThrAlaAlaAspLeuMetLysLeuAlaMetVal
755 760 765
LysLeuPheProArgLeuArgGluMetGlyAlaArgMetLeuLeuGln
770 775 780
ValHisAspGluLeuLeuLeuGluAlaProGlnAlaArgAlaGluGlu
785 790 795 800
ValAlaAlaLeuAlaLysGluAlaMetGluLysAlaTyrProLeuAla
805 810 815
ValProLeuGluValGluValGlyMetGlyGluAspTrpLeuSerAla
820 825 830
LysGly
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2679 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Thermosipho africanus
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2676
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
ATGGGAAAGATGTTTCTATTTGATGGAACTGGATTAGTATACAGAGCA 48
MetGlyLysMetPheLeuPheAspGlyThrGlyLeuValTyrArgAla
1 5 10 15
TTTTATGCTATAGATCAATCTCTTCAAACTTCGTCTGGTTTACACACT 96
PheTyrAlaIleAspGlnSerLeuGlnThrSerSerGlyLeuHisThr
20 25 30
AATGCTGTATACGGACTTACTAAAATGCTTATAAAATTTTTAAAAGAA 144
AsnAlaValTyrGlyLeuThrLysMetLeuIleLysPheLeuLysGlu
35 40 45
CATATCAGTATTGGAAAAGATGCTTGTGTTTTTGTTTTAGATTCAAAA 192
HisIleSerIleGlyLysAspAlaCysValPheValLeuAspSerLys
50 55 60
GGTGGTAGCAAAAAAAGAAAGGATATTCTTGAAACATATAAAGCAAAT 240
GlyGlySerLysLysArgLysAspIleLeuGluThrTyrLysAlaAsn
65 70 75 80
AGGCCATCAACGCCTGATTTACTTTTAGAGCAAATTCCATATGTAGAA 288
ArgProSerThrProAspLeuLeuLeuGluGlnIleProTyrValGlu
85 90 95
GAACTTGTTGATGCTCTTGGAATAAAAGTTTTAAAAATAGAAGGCTTT 336
GluLeuValAspAlaLeuGlyIleLysValLeuLysIleGluGlyPhe
100 105 110
GAAGCTGATGACATTATTGCTACGCTTTCTAAAAAATTTGAAAGTGAT 384
GluAlaAspAspIleIleAlaThrLeuSerLysLysPheGluSerAsp
115 120 125
TTTGAAAAGGTAAACATAATAACTGGAGATAAAGATCTTTTACAACTT 432
PheGluLysValAsnIleIleThrGlyAspLysAspLeuLeuGlnLeu
130 135 140
GTTTCTGATAAGGTTTTTGTTTGGAGAGTAGAAAGAGGAATAACAGAT 480
ValSerAspLysValPheValTrpArgValGluArgGlyIleThrAsp
145 150 155 160
TTGGTATTGTACGATAGAAATAAAGTGATTGAAAAATATGGAATCTAC 528
LeuValLeuTyrAspArgAsnLysValIleGluLysTyrGlyIleTyr
165 170 175
CCAGAACAATTCAAAGATTATTTATCTCTTGTCGGTGATCAGATTGAT 576
ProGluGlnPheLysAspTyrLeuSerLeuValGlyAspGlnIleAsp
180 185 190
AATATCCCAGGAGTTAAAGGAATAGGAAAGAAAACAGCTGTTTCGCTT 624
AsnIleProGlyValLysGlyIleGlyLysLysThrAlaValSerLeu
195 200 205
TTGAAAAAATATAATAGCTTGGAAAATGTATTAAAAAATATTAACCTT 672
LeuLysLysTyrAsnSerLeuGluAsnValLeuLysAsnIleAsnLeu
210 215 220
TTGACGGAAAAATTAAGAAGGCTTTTGGAAGATTCAAAGGAAGATTTG 720
LeuThrGluLysLeuArgArgLeuLeuGluAspSerLysGluAspLeu
225 230 235 240
CAAAAAAGTATAGAACTTGTGGAGTTGATATATGATGTACCAATGGAT 768
GlnLysSerIleGluLeuValGluLeuIleTyrAspValProMetAsp
245 250 255
GTGGAAAAAGATGAAATAATTTATAGAGGGTATAATCCAGATAAGCTT 816
ValGluLysAspGluIleIleTyrArgGlyTyrAsnProAspLysLeu
260 265 270
TTAAAGGTATTAAAAAAGTACGAATTTTCATCTATAATTAAGGAGTTA 864
LeuLysValLeuLysLysTyrGluPheSerSerIleIleLysGluLeu
275 280 285
AATTTACAAGAAAAATTAGAAAAGGAATATATACTGGTAGATAATGAA 912
AsnLeuGlnGluLysLeuGluLysGluTyrIleLeuValAspAsnGlu
290 295 300
GATAAATTGAAAAAACTTGCAGAAGAGATAGAAAAATACAAAACTTTT 960
AspLysLeuLysLysLeuAlaGluGluIleGluLysTyrLysThrPhe
305 310 315 320
TCAATTGATACGGAAACAACTTCACTTGATCCATTTGAAGCTAAACTG 1008
SerIleAspThrGluThrThrSerLeuAspProPheGluAlaLysLeu
325 330 335
GTTGGGATCTCTATTTCCACAATGGAAGGGAAGGCGTATTATATTCCG 1056
ValGlyIleSerIleSerThrMetGluGlyLysAlaTyrTyrIlePro
340 345 350
GTGTCTCATTTTGGAGCTAAGAATATTTCCAAAAGTTTAATAGATAAA 1104
ValSerHisPheGlyAlaLysAsnIleSerLysSerLeuIleAspLys
355 360 365
TTTCTAAAACAAATTTTGCAAGAGAAGGATTATAATATCGTTGGTCAG 1152
PheLeuLysGlnIleLeuGlnGluLysAspTyrAsnIleValGlyGln
370 375 380
AATTTAAAATTTGACTATGAGATTTTTAAAAGCATGGGTTTTTCTCCA 1200
AsnLeuLysPheAspTyrGluIlePheLysSerMetGlyPheSerPro
385 390 395 400
AATGTTCCGCATTTTGATACGATGATTGCAGCCTATCTTTTAAATCCA 1248
AsnValProHisPheAspThrMetIleAlaAlaTyrLeuLeuAsnPro
405 410 415
GATGAAAAACGTTTTAATCTTGAAGAGCTATCCTTAAAATATTTAGGT 1296
AspGluLysArgPheAsnLeuGluGluLeuSerLeuLysTyrLeuGly
420 425 430
TATAAAATGATCTCGTTTGATGAATTAGTAAATGAAAATGTACCATTG 1344
TyrLysMetIleSerPheAspGluLeuValAsnGluAsnValProLeu
435 440 445
TTTGGAAATGACTTTTCGTATGTTCCACTAGAAAGAGCCGTTGAGTAT 1392
PheGlyAsnAspPheSerTyrValProLeuGluArgAlaValGluTyr
450 455 460
TCCTGTGAAGATGCCGATGTGACATACAGAATATTTAGAAAGCTTGGT 1440
SerCysGluAspAlaAspValThrTyrArgIlePheArgLysLeuGly
465 470 475 480
AGGAAGATATATGAAAATGAGATGGAAAAGTTGTTTTACGAAATTGAG 1488
ArgLysIleTyrGluAsnGluMetGluLysLeuPheTyrGluIleGlu
485 490 495
ATGCCCTTAATTGATGTTCTTTCAGAAATGGAACTAAATGGAGTGTAT 1536
MetProLeuIleAspValLeuSerGluMetGluLeuAsnGlyValTyr
500 505 510
TTTGATGAGGAATATTTAAAAGAATTATCAAAAAAATATCAAGAAAAA 1584
PheAspGluGluTyrLeuLysGluLeuSerLysLysTyrGlnGluLys
515 520 525
ATGGATGGAATTAAGGAAAAAGTTTTTGAGATAGCTGGTGAAACTTTC 1632
MetAspGlyIleLysGluLysValPheGluIleAlaGlyGluThrPhe
530 535 540
AATTTAAACTCTTCAACTCAAGTAGCATATATACTATTTGAAAAATTA 1680
AsnLeuAsnSerSerThrGlnValAlaTyrIleLeuPheGluLysLeu
545 550 555 560
AATATTGCTCCTTACAAAAAAACAGCGACTGGTAAGTTTTCAACTAAT 1728
AsnIleAlaProTyrLysLysThrAlaThrGlyLysPheSerThrAsn
565 570 575
GCGGAAGTTTTAGAAGAACTTTCAAAAGAACATGAAATTGCAAAATTG 1776
AlaGluValLeuGluGluLeuSerLysGluHisGluIleAlaLysLeu
580 585 590
TTGCTGGAGTATCGAAAGTATCAAAAATTAAAAAGTACATATATTGAT 1824
LeuLeuGluTyrArgLysTyrGlnLysLeuLysSerThrTyrIleAsp
595 600 605
TCAATACCGTTATCTATTAATCGAAAAACAAACAGGGTCCATACTACT 1872
SerIleProLeuSerIleAsnArgLysThrAsnArgValHisThrThr
610 615 620
TTTCATCAAACAGGAACTTCTACTGGAAGATTAAGTAGTTCAAATCCA 1920
PheHisGlnThrGlyThrSerThrGlyArgLeuSerSerSerAsnPro
625 630 635 640
AATTTGCAAAATCTTCCAACAAGAAGCGAAGAAGGAAAAGAAATAAGA 1968
AsnLeuGlnAsnLeuProThrArgSerGluGluGlyLysGluIleArg
645 650 655
AAAGCAGTAAGACCTCAAAGACAAGATTGGTGGATTTTAGGTGCTGAC 2016
LysAlaValArgProGlnArgGlnAspTrpTrpIleLeuGlyAlaAsp
660 665 670
TATTCTCAGATAGAACTAAGGGTTTTAGCGCATGTAAGTAAAGATGAA 2064
TyrSerGlnIleGluLeuArgValLeuAlaHisValSerLysAspGlu
675 680 685
AATCTACTTAAAGCATTTAAAGAAGATTTAGATATTCATACAATTACT 2112
AsnLeuLeuLysAlaPheLysGluAspLeuAspIleHisThrIleThr
690 695 700
GCTGCCAAAATTTTTGGTGTTTCAGAGATGTTTGTTAGTGAACAAATG 2160
AlaAlaLysIlePheGlyValSerGluMetPheValSerGluGlnMet
705 710 715 720
AGAAGAGTTGGAAAGATGGTAAATTTTGCAATTATTTATGGAGTTTCA 2208
ArgArgValGlyLysMetValAsnPheAlaIleIleTyrGlyValSer
725 730 735
CCTTATGGTCTTTCAAAGAGAATTGGTCTTAGTGTTTCAGAGACTAAA 2256
ProTyrGlyLeuSerLysArgIleGlyLeuSerValSerGluThrLys
740 745 750
AAAATAATAGATAACTATTTTAGATACTATAAAGGAGTTTTTGAATAT 2304
LysIleIleAspAsnTyrPheArgTyrTyrLysGlyValPheGluTyr
755 760 765
TTAAAAAGGATGAAAGATGAAGCAAGGAAAAAAGGTTATGTTACAACG 2352
LeuLysArgMetLysAspGluAlaArgLysLysGlyTyrValThrThr
770 775 780
CTTTTTGGAAGGCGCAGATATATTCCACAGTTAAGATCGAAAAATGGT 2400
LeuPheGlyArgArgArgTyrIleProGlnLeuArgSerLysAsnGly
785 790 795 800
AATAGAGTTCAAGAAGGAGAAAGAATAGCTGTAAACACTCCAATTCAA 2448
AsnArgValGlnGluGlyGluArgIleAlaValAsnThrProIleGln
805 810 815
GGAACAGCAGCTGATATAATAAAGATAGCTATGATTAATATTCATAAT 2496
GlyThrAlaAlaAspIleIleLysIleAlaMetIleAsnIleHisAsn
820 825 830
AGATTGAAGAAGGAAAATCTACGTTCAAAAATGATATTGCAGGTTCAT 2544
ArgLeuLysLysGluAsnLeuArgSerLysMetIleLeuGlnValHis
835 840 845
GACGAGTTAGTTTTTGAAGTGCCCGATAATGAACTGGAGATTGTAAAA 2592
AspGluLeuValPheGluValProAspAsnGluLeuGluIleValLys
850 855 860
GATTTAGTAAGAGATGAGATGGAAAATGCAGTTAAGCTAGACGTTCCT 2640
AspLeuValArgAspGluMetGluAsnAlaValLysLeuAspValPro
865 870 875 880
TTAAAAGTAGATGTTTATTATGGAAAAGAGTGGGAATAA 2679
LeuLysValAspValTyrTyrGlyLysGluTrpGlu
885 890
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 892 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
MetGlyLysMetPheLeuPheAspGlyThrGlyLeuValTyrArgAla
1 5 10 15
PheTyrAlaIleAspGlnSerLeuGlnThrSerSerGlyLeuHisThr
20 25 30
AsnAlaValTyrGlyLeuThrLysMetLeuIleLysPheLeuLysGlu
35 40 45
HisIleSerIleGlyLysAspAlaCysValPheValLeuAspSerLys
50 55 60
GlyGlySerLysLysArgLysAspIleLeuGluThrTyrLysAlaAsn
65 70 75 80
ArgProSerThrProAspLeuLeuLeuGluGlnIleProTyrValGlu
85 90 95
GluLeuValAspAlaLeuGlyIleLysValLeuLysIleGluGlyPhe
100 105 110
GluAlaAspAspIleIleAlaThrLeuSerLysLysPheGluSerAsp
115 120 125
PheGluLysValAsnIleIleThrGlyAspLysAspLeuLeuGlnLeu
130 135 140
ValSerAspLysValPheValTrpArgValGluArgGlyIleThrAsp
145 150 155 160
LeuValLeuTyrAspArgAsnLysValIleGluLysTyrGlyIleTyr
165 170 175
ProGluGlnPheLysAspTyrLeuSerLeuValGlyAspGlnIleAsp
180 185 190
AsnIleProGlyValLysGlyIleGlyLysLysThrAlaValSerLeu
195 200 205
LeuLysLysTyrAsnSerLeuGluAsnValLeuLysAsnIleAsnLeu
210 215 220
                            LeuThrGluLysLeuArgArgLeuLeuGluAspSerLysGluAspLeu                              225230235240                                                                  GlnLysSerIl eGluLeuValGluLeuIleTyrAspValProMetAsp                             245250255                                                                     ValGluLysAspGluIleIleTyrArgGlyTyrAsnProAspLysLeu                              260 265270                                                                    LeuLysValLeuLysLysTyrGluPheSerSerIleIleLysGluLeu                              275280285                                                                     AsnLeuGlnGluLysLeuGluLysGluTyr IleLeuValAspAsnGlu                             290295300                                                                     AspLysLeuLysLysLeuAlaGluGluIleGluLysTyrLysThrPhe                              305310315 320                                                                 SerIleAspThrGluThrThrSerLeuAspProPheGluAlaLysLeu                              325330335                                                                     ValGlyIleSerIleSerThrMetGluGlyLysAlaTyrTyrIle Pro                             340345350                                                                     ValSerHisPheGlyAlaLysAsnIleSerLysSerLeuIleAspLys                              355360365                                                                     PheLeuLysGl nIleLeuGlnGluLysAspTyrAsnIleValGlyGln                             370375380                                                                     AsnLeuLysPheAspTyrGluIlePheLysSerMetGlyPheSerPro                              385390 395400                                                                 AsnValProHisPheAspThrMetIleAlaAlaTyrLeuLeuAsnPro                              405410415                                                                     AspGluLysArgPheAsnLeuGluGlu LeuSerLeuLysTyrLeuGly                             420425430             
                                                        TyrLysMetIleSerPheAspGluLeuValAsnGluAsnValProLeu                              435440 445                                                                    PheGlyAsnAspPheSerTyrValProLeuGluArgAlaValGluTyr                              450455460                                                                     SerCysGluAspAlaAspValThrTyrArgIlePheArgLysLeuGly                              465 470475480                                                                 ArgLysIleTyrGluAsnGluMetGluLysLeuPheTyrGluIleGlu                              485490495                                                                     MetProLe uIleAspValLeuSerGluMetGluLeuAsnGlyValTyr                             500505510                                                                     PheAspGluGluTyrLeuLysGluLeuSerLysLysTyrGlnGluLys                              515 520525                                                                    MetAspGlyIleLysGluLysValPheGluIleAlaGlyGluThrPhe                              530535540                                                                     AsnLeuAsnSerSerThrGlnValAlaTyrIleLeu PheGluLysLeu                             545550555560                                                                  AsnIleAlaProTyrLysLysThrAlaThrGlyLysPheSerThrAsn                              565570 575                                                                    AlaGluValLeuGluGluLeuSerLysGluHisGluIleAlaLysLeu                              580585590                                                                     LeuLeuGluTyrArgLysTyrGlnLysLeuLysSerThrTyrIle Asp                             595600605                                                                     SerIleProLeuSerIleAsnArgLysThrAsnArgValHisThrThr                              610615620                                                                     PheHisGlnThrGlyTh rSerThrGlyArgLeuSerSerSerAsnPro                       
      625630635640                                                                  AsnLeuGlnAsnLeuProThrArgSerGluGluGlyLysGluIleArg                              645 650655                                                                    LysAlaValArgProGlnArgGlnAspTrpTrpIleLeuGlyAlaAsp                              660665670                                                                     TyrSerGlnIleGluLeuArgValLeu AlaHisValSerLysAspGlu                             675680685                                                                     AsnLeuLeuLysAlaPheLysGluAspLeuAspIleHisThrIleThr                              690695700                                                                     AlaAlaLysIlePheGlyValSerGluMetPheValSerGluGlnMet                              705710715720                                                                  ArgArgValGlyLysMetValAsnPheAlaIleIleTyrGlyValSer                              725730735                                                                     ProTyrGlyLeuSerLysArgIleGlyLeuSerValSerGluThrLys                              740745750                                                                     LysIleIl eAspAsnTyrPheArgTyrTyrLysGlyValPheGluTyr                             755760765                                                                     LeuLysArgMetLysAspGluAlaArgLysLysGlyTyrValThrThr                              770 775780                                                                    LeuPheGlyArgArgArgTyrIleProGlnLeuArgSerLysAsnGly                              785790795800                                                                  AsnArgValGlnGluGlyGluArgIleAla ValAsnThrProIleGln                             805810815                                                                     GlyThrAlaAlaAspIleIleLysIleAlaMetIleAsnIleHisAsn                              820825 830                                                                    
ArgLeuLysLysGluAsnLeuArgSerLysMetIleLeuGlnValHis
835 840 845
AspGluLeuValPheGluValProAspAsnGluLeuGluIleValLys
850 855 860
AspLeuValArgAspGluMetGluAsnAlaValLysLeuAspValPro
865 870 875 880
LeuLysValAspValTyrTyrGlyLysGluTrpGlu
885 890

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA probe BW33
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
GATCGCTGCGCGTAACCACCACACCCGCCGCGC 33

(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer BW37
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GCGCTAGGGCGCTGGCAAGTGTAGCGGTCA 30

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: Peptide
(B) LOCATION: 1..4
(D) OTHER INFORMATION: /label=Xaa
/note="Xaa = Val or Thr"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
AlaXaaTyrGly

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
HisGluAlaTyrGly
1 5

(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
HisGluAlaTyrGlu
1 5

(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(ix) FEATURE:
(A) NAME/KEY: Peptide
(B) LOCATION: 1..4
(D) OTHER INFORMATION: /label=Xaa
/note="Xaa = Leu or Ile"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
XaaLeuGluThr
1

(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(ix) FEATURE:
(A) NAME/KEY: Peptide
(B) LOCATION: 1..7
(D) OTHER INFORMATION: /label=Xaa
/note="Xaa = Leu or Ile"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
XaaLeuGluThrTyrLysAla
1 5

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(ix) FEATURE:
(A) NAME/KEY: Peptide
(B) LOCATION: 1..7
(D) OTHER INFORMATION: /label=Xaa1-4
/note="Xaa1 = Ile or Leu or Ala; Xaa2-4, each = any amino acid"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
XaaXaaXaaXaaTyrLysAla
1 5

(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer MK61
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
AGGACTACAACTGCCACACACC 22

(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer RA01
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CGAGGCGCGCCAGCCCCAGGAGATCTACCAGCTCCTTG 38

(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer DG29
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
AGCTTATGTCTCCAAAAGCT 20

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer DG30
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
AGCTTTTGGAGACATA 16

(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer PL10
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GGCGTACCTTTGTCTCACGGGCAAC 25

(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL63
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GATAAAGGCATGCTTCAGCTTGTGAACG 28

(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL69
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TGTACTTCTCTAGAAGCTGAACAGCAG 27

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL64
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CTGAAGCATGTCTTTGTCACCGGTTACTATCAATAT 36

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL65
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
TAGTAACCGGTGACAAAG 18

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL66
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
CTATGCCATGGATAGATCGCTTTCTACTTCC 31

(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer FL67
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
CAAGCCCATGGAAACTTACAAGGCTCAAAGA 31

(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TZA292
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GTCGGCATATGGCTCCTGCTCCTCTTGAGGAGGCCCCCTGGCCCCCGCC 49

(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TZR01
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GACGCAGATCTCAGCCCTTGGCGGAAAGCCAGTCCTC 37

(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TSA288
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
GTCGGCATATGGCTCCTAAAGAAGCTGAGGAGGCCCCCTGGCCCCCGCC 49

(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TSR01
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GACGCAGATCTCAGGCCTTGGCGGAAAGCCAGTCCTC 37

(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer DG122
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CCTCTAAACGGCAGATCTGATATCAACCCTTGGCGGAAAGC 41

(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TAFI285
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
GTCGGCATATGATTAAAGAACTTAATTTACAAGAAAAATTAGAAAAGG 48

(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA primer TAFR01
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
CCTTTACCCCAGGATCCTCATTCCCACTCTTTTCCATAATAAACAT 46

What is claimed is:
 1. A recombinant thermostable DNA polymerase enzyme which is characterized in that: (a) in its native form said polymerase comprises a 5' to 3' exonuclease domain providing 5' to 3' exonuclease activity, wherein said domain comprises an amino acid sequence selected from the group consisting of: A(X)YG wherein X is V or T (SEQ ID NO:15), (b) said amino acid sequence is mutated in said recombinant enzyme by means other than N-terminal deletion, and (c) said recombinant enzyme has a lesser amount of 5' to 3' exonuclease activity than that of the native form of said enzyme.
 2. The recombinant thermostable DNA polymerase enzyme of claim 1 wherein Gly of SEQ ID NO:15 is mutated.
 3. The recombinant thermostable DNA polymerase enzyme of claim 2 wherein Gly of SEQ ID NO:15 is mutated to Asp.
 4. The recombinant thermostable DNA polymerase enzyme of claim 1 wherein said enzyme is selected from the group consisting of Thermus species sps17, Thermus species Z05, Thermus aquaticus, Thermus thermophilus, Thermosipho africanus, and Thermotoga maritima DNA polymerases.
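
By way of illustration only, and forming no part of the claims, the motif of SEQ ID NO:15 and the Gly to Asp substitution recited in claims 2 and 3 can be sketched in Python. The function names below are hypothetical and do not come from this document. The test fragment is the one-letter form of the nine residues AsnAlaValTyrGlyLeuThrLysMet that appear in the sequence listing above. Applying the substitution to the first match only is an assumption made for simplicity.

```python
import re

# One-letter form of SEQ ID NO:15: A(X)YG, where X is V or T.
MOTIF = re.compile(r"A[VT]YG")

def find_motif(seq):
    """Return the 0-based start offset of every A(V/T)YG match in seq."""
    return [m.start() for m in MOTIF.finditer(seq)]

def mutate_gly_to_asp(seq):
    """Replace the Gly of the first A(V/T)YG match with Asp (G -> D).

    Illustrative helper only; returns seq unchanged if no motif is found.
    """
    m = MOTIF.search(seq)
    if m is None:
        return seq
    gly = m.start() + 3  # Gly is the fourth residue of the motif
    return seq[:gly] + "D" + seq[gly + 1:]

# AsnAlaValTyrGlyLeuThrLysMet from the sequence listing, one-letter code.
fragment = "NAVYGLTKM"
print(find_motif(fragment))         # [1]
print(mutate_gly_to_asp(fragment))  # NAVYDLTKM
```

The sketch operates on amino acid strings only; it says nothing about how the corresponding nucleotide change is introduced or how exonuclease activity is measured.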